Chapter 1



Introduction 



A few short years ago, the applications for 
video were somewhat confined — analog was 
used for broadcast and cable television, VCRs, 
set-top boxes, televisions, and camcorders. 
Since then, there has been a tremendous and 
rapid conversion to digital video, mostly based 
on the MPEG-2 video compression standard. 

Today, in addition to the legacy DV, 
MPEG-1, and MPEG-2 audio and video com- 
pression standards, there are three new high- 
performance video compression standards. 
These new video codecs offer much higher 
video compression for a given level of video 
quality. 

• MPEG-4.2. This video codec typically 
offers a 1.5-2x improvement in com- 
pression ratio over MPEG-2. Able to 
address a wide variety of markets, 
MPEG-4.2 never really achieved wide- 
spread acceptance due to its complexity. 
Also, many simply decided to wait for 
the new MPEG-4.10 (H.264) video 
codec to become available. 



• MPEG-4.10 (H.264). This video codec 
typically offers a 2-3x improvement in 
compression ratio over MPEG-2. Addi- 
tional improvements in compression 
ratios and quality are expected as the 
encoders become better and use more 
of the available tools that MPEG-4.10 
(H.264) offers. Learning a lesson from 
MPEG-4, MPEG-4.10 (H.264) is opti- 
mized for implementing on low-cost sin- 
gle-chip solutions and has already been 
adopted by the DVB and ARIB. 

• SMPTE 421M (VC-1). A competitor to
MPEG-4.10 (H.264), this video codec 
also typically offers a 2-3x improvement 
in compression ratios over MPEG-2. 
Again, additional improvements in com- 
pression ratios and quality are expected 
as the encoders become better. 

Many more audio codecs are also available
as a result of the interest in 6.1- and 7.1-chan- 
nel audio, multi-channel lossless compression, 
lower bit-rates for the same level of audio qual- 
ity, and finally, higher bit-rates for applications 
needing the highest audio quality at a reason- 
able bit-rate. 

In addition to decoding audio, real-time 
high-quality audio encoding is needed for 
DVD, HD DVD and Blu-ray recorders and digi- 
tal video recorders (DVRs). Combining all 
these audio requirements mandates that any 
single-chip solution for the consumer market 
incorporate a DSP for audio processing. 

Equipment for the consumer has also 
become more sophisticated, supporting a 
much wider variety of content and interconnec- 
tivity. Today we have: 

• HD DVD and Blu-ray Players and 
Recorders. In addition to playing CDs 
and DVDs, these advanced HD players 
also support the playback of MPEG-4.10 
(H.264), and SMPTE 421M (VC-1) con- 
tent. Some include an Ethernet connec- 
tion to enable content from a PC or 
media server to be easily enjoyed on the 
television. 

• Digital Media Adapters. These small, 
low-cost boxes use an Ethernet or 
802.11 connection to enable content 
from a PC or media server to be easily 
enjoyed on any television. Playback of 
MPEG-2, MPEG-4.10 (H.264), SMPTE 
421M (VC-1), and JPEG content is typi- 
cally supported. 



• Digital Set-Top Boxes. Cable and satellite 
set-top boxes are now including digital 
video recorder (DVR) capabilities, 
allowing viewers to enjoy content at 
their convenience. Use of MPEG-4.10 
(H.264) and SMPTE 421M (VC-1) now
enables more channels of content and 
reduces the chance of early product 
obsolescence. 

• Digital Televisions (DTV). In addition to 
the tuners and decoders being incorpo- 
rated inside the television, some also 
include the digital media adapter capa- 
bility. Support for viewing on-line video 
content is also growing. 

• IPTV Set-Top Boxes. These low-cost set- 
top boxes are gaining popularity in 
regions that have high-speed DSL and 
FTTH (fiber to the home) available. Use 
of MPEG-4.10 (H.264) and SMPTE 
421M (VC-1) reduces the chance of 
early product obsolescence. 

• Portable Media Players. Using an internal
hard disc drive (HDD), these players
connect to the PC via USB or 802.11
network for downloading a wide variety 
of content. Playback of MPEG-2, MPEG- 
4.10 (H.264), SMPTE 421M (VC-1), and 
JPEG content is typically supported. 

• Mobile Video Receivers. These are being
incorporated into cell phones. MPEG-4.10
(H.264) and SMPTE 421M (VC-1) are
used to transmit a high-quality video signal.
Example applications are the DMB,
DVB-H, and DVB-SH standards.

Of course, making these advanced consumer
products requires more than just supporting
an audio and video codec. There is also
the need to support:

• Closed Captioning, Subtitles, Teletext, 
and V-Chip. These standards were 
updated to support digital broadcasts. 

• Advanced Video Processing. Due to the 
wide range of resolutions for both con- 
tent and displays, sophisticated high- 
quality scaling and motion adaptive 
deinterlacing are usually required. 

Since the standard-definition (SD) and 
high-definition (HD) standards use dif- 
ferent colorimetry standards, this also 
needs to be corrected when viewing SD 
content on an HDTV or HD content on 
an SDTV. 

• Sophisticated Image Composition. The
ability to render a sophisticated image
composed of a variety of video, OSD
(on-screen display), subtitle/captioning/subpicture,
text, and graphics elements.

• ARIB and DVB over IP. The complexity 
of supporting IP video is increasing, 
with deployments now incorporating 
ARIB and DVB over IP. 

• Digital Rights Management (DRM). The
protection of content from unauthorized 
copying or viewing. 

This fifth edition of Video Demystified has 
been updated to reflect these changing times. 
Implementing real-world solutions is not easy, 
and many engineers have little knowledge or 
experience in this area. This book is a guide 
for those engineers charged with the task of 
understanding and implementing video fea- 
tures into next-generation designs. 



This book can be used by engineers who 
need or desire to learn about video, VLSI 
design engineers working on new video prod- 
ucts, or anyone who wants to evaluate or sim- 
ply know more about video systems. 



Contents 

The book is organized as follows: 

Chapter 2, an Introduction to Video, dis- 
cusses the various video formats and signals, 
where they are used, and the differences 
between interlaced and progressive video. 
Block diagrams of DVD players and digital set- 
top boxes are provided. 

Chapter 3 reviews the common Color 
Spaces, how they are mathematically related, 
and when a specific color space is used. Color 
spaces reviewed include RGB, YUV, YIQ, 
YCbCr, xvYCC, HSI, HSV, and HLS. Consider- 
ations for converting from a non-RGB to an 
RGB color space and gamma correction are 
also discussed. 

Chapter 4 is a Video Signals Overview that 
reviews the video timing and the analog and 
digital representations of various video for- 
mats, including 480i, 480p, 576i, 576p, 720p, 
1080i, and 1080p. 

Chapter 5 discusses the Analog Video
Interfaces, including the analog RGB, YPbPr,
S-Video, and SCART interfaces for consumer
and pro-video applications.

Chapter 6 discusses the various Digital 
Video Interfaces for semiconductors, pro-video 
equipment, and consumer equipment. It 
reviews the BT.601 and BT.656 semiconductor 
interfaces; the SDI, SDTI, and HD-SDTI pro- 
video interfaces; and the DVI, HDMI, and 
IEEE 1394 consumer interfaces. 

Chapter 7 covers several Digital Video Pro- 
cessing requirements such as 4:4:4 to 4:2:2 
YCbCr, YCbCr digital filter templates, scaling, 
interlaced/noninterlaced conversion, frame
rate conversion, alpha mixing, flicker filtering, 
and chroma keying. Brightness, contrast, satu- 
ration, hue, and sharpness controls are also 
discussed. 

Chapter 8 provides an NTSC, PAL, and 
SECAM Overview. The various composite ana- 
log video signal formats are reviewed, along 
with video test signals. VBI data discussed 
includes timecode, closed captioning and 
extended data services (XDS), widescreen
signaling, and teletext. In addition, PALplus, RF
modulation, BTSC, and Zweiton analog stereo 
audio and NICAM 728 digital stereo audio are 
reviewed. 

Chapter 9 covers digital techniques used 
for the Encoding and Decoding of NTSC and 
PAL color video signals. Also reviewed are var- 
ious luma/chroma (Y/C) separation tech- 
niques and their trade-offs. 

Chapter 10 discusses the H.261 and H.263 
video compression standards used for video 
teleconferencing. 

Chapter 11 discusses the Consumer DV 
video compression standards used by digital 
camcorders. 



Chapter 12 reviews the MPEG-1 video 
compression standard. 

Chapter 13 discusses the MPEG-2 video 
compression standard. 

Chapter 14 discusses the MPEG-4 video 
compression standard, including MPEG-4.10 
(H.264).

Chapter 15 discusses the ATSC Digital 
Television standard used in the United States. 

Chapter 16 discusses the OpenCable™ Digital
Television standard used in the United
States.

Chapter 17 discusses the DVB Digital Tele- 
vision standard used in Europe and Asia. 

Chapter 18 discusses the ISDB Digital
Television standard used in Japan. 

Chapter 19 discusses IPTV. This technology
sends compressed video over broadband
networks such as the Internet, DSL, and FTTH
(fiber to the home).

Finally, Chapter 20 is a glossary of over 
400 video terms. If you encounter an unfamiliar 
term, it likely will be defined in the glossary. 

Standards Organizations

Many standards organizations, some of 
which are listed below, are involved in specify- 
ing video standards. 

Advanced Television Systems 
Committee (ATSC) 

www.atsc.org 

Association of Radio Industries and 
Businesses (ARIB) 

www.arib.or.jp 

Cable Television Laboratories 

www.cablelabs.com 

Consumer Electronics Association
(CEA)

www.ce.org 

Digital Video Broadcasting (DVB) 

www.dvb.org 

Electronic Industries Alliance (EIA) 

www.eia.org 

European Broadcasting Union (EBU) 

www.ebu.ch 

European Telecommunications 
Standards Institute (ETSI) 

www.etsi.org 



International Electrotechnical 
Commission (IEC) 

www.iec.ch 

Institute of Electrical and Electronics 
Engineers (IEEE) 

www.ieee.org 

International Organization for 
Standardization (ISO) 

www.iso.org 

International Telecommunication Union 
(ITU) 

www.itu.int 

Society of Cable Telecommunications 
Engineers (SCTE) 

www.scte.org 

Society of Motion Picture and Television 
Engineers (SMPTE) 

www.smpte.org 

Video Electronics Standards Association 
(VESA) 

www.vesa.org 




Chapter 2 



Introduction 
to Video 



Although there are many variations and 
implementation techniques, video signals are 
just a way of transferring visual information 
from one point to another. The information 
may be from a VCR, DVD player, a channel on 
the local broadcast, cable television, or satellite 
system, the Internet, or one of many other 
sources. 

Invariably, the video information must be 
transferred from one device to another. It 
could be from a satellite set-top box or DVD 
player to a television. Or it could be from one 
chip to another inside the satellite set-top box 
or television. Although it seems simple, there 
are many different requirements, and there- 
fore many different ways of doing it. 



Analog vs. Digital 

Until a few years ago, most video equip- 
ment was designed primarily for analog video. 
Digital video was confined to professional 
applications, such as video editing. 



The average consumer now uses digital
video every day thanks to continually falling
costs. This trend has led to the development of
DVD players and recorders, digital set-top
boxes, digital television (DTV), portable video
players, and the ability to use the Internet for
transferring video data.



Video Data 

Initially, video contained only gray-scale 
(also called black-and-white) information. 

While color broadcasts were being devel- 
oped, attempts were made to transmit color 
video using analog RGB (red, green, blue) 
data. However, this technique occupied 3x 
more bandwidth than the current gray-scale 
solution, so alternate methods were developed 
that led to using Y, R-Y, and B-Y data to represent
color information. A technique was then
developed to transmit this Y, R-Y, and B-Y
information using one signal, instead of three 
separate signals, and in the same bandwidth as 
the original gray-scale video signal. This composite
video signal is what the NTSC, PAL, and
SECAM video standards are still based on 
today. This technique is discussed in more 
detail in Chapters 8 and 9. 

Today, even though there are many ways 
of representing video, they are still all related 
mathematically to RGB. These variations are 
discussed in more detail in Chapter 3. 

S-Video was developed for connecting con- 
sumer equipment together (it is not used for 
broadcast purposes). It is a set of two analog
signals, one gray-scale (Y) and one that carries 
the analog R-Y and B-Y color information in a 
specific format (also called C or chroma). 
Once available only for S-VHS, it is now sup- 
ported on most consumer video products. This 
is discussed in more detail in Chapter 9. 

Although always used by the professional 
video market, analog RGB video data has made 
a temporary comeback for connecting high- 
end consumer equipment together. Like S- 
Video, it is not used for broadcast purposes. 

A variation of the Y, R-Y, and B-Y video
signals, called YPbPr, is now commonly used
for connecting consumer video products
together. Its primary advantage is the ability to
transfer high-definition video between consumer
products. Some manufacturers incorrectly
label the YPbPr connectors YUV, YCbCr,
or Y (B-Y) (R-Y).

Chapter 5 discusses the various analog 
interconnect schemes in detail. 

Digital Video 

The most common digital signals used are 
RGB and YCbCr. RGB is simply the digitized 
version of the analog RGB video signals. 
YCbCr is basically the digitized version of the 
analog YPbPr video signals, and is the format 
used by DVD and digital television. 

Chapter 6 further discusses the various 
digital interconnect schemes. 



Best Connection Method 

There is always the question of “what is 
the best connection method for equipment?” 
For DVD players and digital cable/satellite/ 
terrestrial set-top boxes, the typical order of 
decreasing video quality is: 

1. HDMI (digital YCbCr)

2. HDMI (digital RGB) 

3. Analog YPbPr 

4. Analog RGB 

5. Analog S-Video 

6. Analog Composite 

Some will disagree about the order. How- 
ever, most consumer products do digital video 
processing in the YCbCr color space. There- 
fore, using YCbCr as the interconnect for 
equipment reduces the number of color space 
conversions required. Color space conversion 
of digital signals is still preferable to D/A (digi- 
tal-to-analog) conversion followed by A/D 
(analog-to-digital) conversion, hence the posi- 
tioning of HDMI RGB above analog YPbPr. 

The computer industry has standardized 
on analog and digital RGB for connecting to 
the computer monitor. 



Video Timing 

Although it looks like video is continuous 
motion, it is actually a series of still images, 
changing fast enough that it looks like continu- 
ous motion, as shown in Figure 2.1. This typi- 
cally occurs 50 or 60 times per second for 
consumer video, and 70-90 times per second 
for computer displays. Special timing informa- 
tion, called vertical sync, is used to indicate 
when a new image is starting. 

Figure 2.1. Video Is Composed of a Series of Still Images. Each image is composed of individual
lines of data.

Each still image is also composed of scan
lines, lines of data that occur sequentially one
after another down the display, as shown in
Figure 2.1. Additional timing information,
called horizontal sync, is used to indicate when
a new scan line is starting.

The vertical and horizontal sync information
is usually transferred in one of three ways:

1. Separate horizontal and vertical sync signals

2. Separate composite sync signal

3. Composite sync signal embedded within the
video signal

The composite sync signal is a combination
of both vertical and horizontal sync.

Computer and consumer equipment that
uses analog RGB video usually uses technique
1 or 2. Consumer equipment that supports
composite video or analog YPbPr video usually
uses technique 3.



For digital video, either technique 1 is 
commonly used or timing code words are 
embedded within the digital video stream. This 
is discussed in Chapter 6. 

Interlaced vs. Progressive 

Since video is a series of still images, it 
makes sense to simply display each full image 
consecutively, one after another.

This is the basic technique of progressive, 
or non-interlaced, displays. For progressive 
displays that “paint” an image on the screen, 
such as a CRT, each image is displayed start- 
ing at the top left corner of the display, moving 
to the right edge of the display. Scanning
then moves down one line and repeats
left-to-right. This process is repeated until
the entire screen is refreshed, as seen in Fig- 
ure 2.2. 

In the early days of television, a technique
called “interlacing” was used to reduce the 
amount of information sent for each image. By 
transferring the odd-numbered lines, followed 
by the even-numbered lines (as shown in Fig- 
ure 2.3), the amount of information sent for 
each image was halved. 
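
As a rough illustration of the line ordering described above, the following Python sketch splits a frame into its two fields and weaves them back together. The function names and the list-of-lines frame representation are assumptions made for this example; they are not part of any video standard.

```python
def split_into_fields(frame):
    """Split a frame (a list of scan lines) into two interlaced fields.

    Scan lines are numbered from 1, so field 1 holds the odd-numbered
    lines (list indices 0, 2, 4, ...) and field 2 the even-numbered lines.
    """
    field1 = frame[0::2]
    field2 = frame[1::2]
    return field1, field2


def weave_fields(field1, field2):
    """Rebuild the full frame by interleaving the two fields line by line."""
    frame = []
    for i in range(len(field1) + len(field2)):
        source = field1 if i % 2 == 0 else field2
        frame.append(source[i // 2])
    return frame


# Hypothetical 8-line frame; real frames carry pixel data, not strings.
frame = [f"line {n}" for n in range(1, 9)]
field1, field2 = split_into_fields(frame)
rebuilt = weave_fields(field1, field2)
```

Each field carries half of the scan lines, which is where the halving of information per transmitted image comes from.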

Given this advantage of interlacing, why 
bother to use progressive? 

With interlace, each scan line is refreshed 
half as often as it would be if it were a progres- 
sive display. Therefore, to avoid line flicker on 
sharp edges due to a too-low frame rate, the 
line-to-line changes are limited, essentially by 
vertically lowpass filtering the image. A progressive
display has no limit on the line-to-line
changes, so is capable of providing a higher- 
resolution image (vertically) without flicker. 

Today, most broadcasts (including HDTV) 
are still transmitted as interlaced. Most CRT- 
based displays are still interlaced while LCD, 
plasma, and computer displays are progres- 
sive. 



Video Resolution 

Video resolution is one of those “fuzzy” 
things in life. It is common to see video resolu- 
tions of 720 x 480 or 1920 x 1080. However, 
those are just the number of horizontal sam- 
ples and vertical scan lines, and do not neces- 
sarily convey the amount of useful information. 

For example, an analog video signal can be 
sampled at 13.5 MHz to generate 720 samples 
per line. Sampling the same signal at 27 MHz 
would generate 1440 samples per line. How- 
ever, only the number of samples per line has 
changed, not the resolution of the content. 
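
The arithmetic behind this example can be sketched in a few lines of Python. The active line time below is an assumption, chosen so that 13.5 MHz sampling yields exactly 720 samples, and the 4.2 MHz source bandwidth is only an illustrative value; neither figure is taken from a specific standard here.

```python
def samples_per_line(sample_rate_hz, active_line_seconds):
    """Number of samples taken across the active part of one scan line."""
    return round(sample_rate_hz * active_line_seconds)


def resolvable_cycles_per_line(bandwidth_hz, active_line_seconds):
    """Upper bound on light/dark cycles per line set by the analog bandwidth."""
    return bandwidth_hz * active_line_seconds


# Assumed active line time (about 53.3 microseconds).
ACTIVE_LINE_SECONDS = 720 / 13.5e6

# Assumed, purely illustrative bandwidth of the analog source.
SOURCE_BANDWIDTH_HZ = 4.2e6

for rate in (13.5e6, 27e6):
    print(rate,
          samples_per_line(rate, ACTIVE_LINE_SECONDS),
          resolvable_cycles_per_line(SOURCE_BANDWIDTH_HZ, ACTIVE_LINE_SECONDS))
```

Doubling the sample rate doubles the sample count, but the detail available in the source, which depends only on its bandwidth and the line time, is unchanged.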



Therefore, video is usually measured 
using lines of resolution. In essence, how many 
distinct black and white vertical lines can be 
seen across the display? This number is then 
normalized to a 1:1 display aspect ratio (multiplying
the number by 3/4 for a 4:3 display, or by
9/16 for a 16:9 display). Of course, this results
in a lower value for widescreen (16:9) displays, 
which goes against intuition. 
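
A minimal sketch of this normalization, assuming the count of distinct lines is measured across the full display width and is then scaled to a width equal to the picture height. The function name and the sample count of 720 are hypothetical.

```python
from fractions import Fraction


def lines_of_resolution(lines_across_width, aspect_width, aspect_height):
    """Scale a line count measured across the full width to a 1:1 area.

    The count is multiplied by height/width, so a wider display yields a
    lower figure for the same number of lines across its width.
    """
    return lines_across_width * Fraction(aspect_height, aspect_width)


# Hypothetical measurement: 720 distinct lines across the full width.
standard = lines_of_resolution(720, 4, 3)      # 4:3 display
widescreen = lines_of_resolution(720, 16, 9)   # 16:9 display
```

With the same 720 lines across the width, the 4:3 display is rated at 540 lines and the 16:9 display at 405 lines, which is the counterintuitive result noted above.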

Standard-Definition 

Standard-definition video is usually defined 
as having 480 or 576 interlaced active scan 
lines, and is commonly called “480i” and “576i,” 
respectively. 

For a fixed-pixel (non-CRT) consumer dis- 
play with a 4:3 aspect ratio, this translates into 
an active resolution of 720 x 480i or 720 x 576i. 
For a 16:9 aspect ratio, this translates into an 
active resolution of 960 x 480i or 960 x 576i. 

Enhanced-Definition 

Enhanced-definition video is usually 
defined as having 480 or 576 progressive active 
scan lines, and is commonly called “480p” and 
“576p,” respectively. 

For a fixed-pixel (non-CRT) consumer dis- 
play with a 4:3 aspect ratio, this translates into 
an active resolution of 720 x 480p or 720 x 
576p. For a 16:9 aspect ratio, this translates 
into an active resolution of 960 x 480p or 960 x 
576p. 

The difference between standard and 
enhanced definition is that standard-definition 
is interlaced, while enhanced-definition is pro- 
gressive. 

Figure 2.2. Progressive Displays “Paint” the Lines of an Image Consecutively, One After Another.

Figure 2.3. Interlaced Displays “Paint” First One-Half of the Image (Odd Lines), Then the Other
Half (Even Lines).


High-Definition

High-definition video is usually defined as 
having 720 progressive (720p) or 1080 interlaced
(1080i) active scan lines. For a fixed-pixel
(non-CRT) consumer display with a 16:9 
aspect ratio, this translates into an active reso- 
lution of 1280 x 720p or 1920 x 1080i, respec- 
tively. 

However, HDTV displays are technically 
defined as being capable of displaying a mini- 
mum of 720p or 1080i active scan lines. They 
also must be capable of displaying 16:9 content 
using a minimum of 540 progressive (540p) or 
810 interlaced (810i) active scan lines. This 
enables the manufacturing of CRT-based 
HDTVs with a 4:3 aspect ratio and LCD/ 
plasma 16:9 aspect ratio displays with resolu- 
tions of 1024 x 1024p, 1280 x 768p, 1024 x 768p, 
and so on, lowering costs. 

Audio and Video 
Compression 

The recent advances in consumer electron- 
ics, such as digital television, DVD players and 
recorders, digital video recorders, and so on, 
were made possible due to audio and video 
compression based largely on MPEG-2 video 
with Dolby® Digital, DTS®, MPEG-1, or 
MPEG-2 audio. 

New audio and video codecs, such as 
MPEG-4 HE-AAC, MPEG-4.10 (H.264), and 
SMPTE 421M (VC-1), offer better compression
than previous codecs for the same quality.
These advances are enabling new ways of dis- 
tributing content (both to consumers and 
within the home), new consumer products 
(such as portable video players and mobile 
video/cell phones), and more cable/satellite 
channels. 



Application Block Diagrams 

Looking at a few simplified block diagrams 
helps envision how video flows through its var- 
ious operations. 

DVD Players 

Figure 2.4 is a simplified block diagram for 
a basic DVD player, showing the common 
blocks. Today, all of this is on a single low-cost 
chip. 

In addition to playing DVDs (which are 
based on MPEG-2 video compression), DVD 
players are now expected to handle MP3 and 
WMA audio, MPEG-4 video (for DivX Video), 
JPEG images, and so on. Special playback 
modes such as slow/fast forward/reverse at 
various speeds are also expected. Support for 
DVD Audio and SACD is also popular. 

A recent enhancement to DVD players is 
the ability to connect to a home network for 
playing content (music, video, pictures, etc.) 
residing on the PC. These “networked DVD 
players” may also include the ability to play 
movies from the Internet and download con- 
tent onto an internal hard disc drive (HDD) for 
later viewing. Support for playing audio, video, 
and pictures from a variety of flash-memory 
cards is also growing. 

In an attempt to look different to quickly 
grab buyers’ attention, some DVD player man- 
ufacturers tweak the video frequency 
response. Since this feature is usually irritating 
over the long term, it should be defeated or 
properly adjusted. For the film look many 
video enthusiasts strive for, the frequency 
response should be as flat as possible. 

Another issue is the output levels of the
analog video signals. Although it is easy to generate
very accurate video levels, they vary considerably.
Reviews now point out this issue
since switching between sources may mean
changing brightness or black levels, defeating
any television calibration or personal adjustments
that may have been done by the user.

Figure 2.4. Simplified Block Diagram of a Basic DVD Player.

Digital Media Adapters 

Digital media adapters connect to a home 
network for playing content (music, video, pic- 
tures, and so on) residing on a PC or media 
server. These small, low-cost boxes enable 
content to be easily enjoyed on any or all televi- 
sions in the home. Many support optional wire- 
less networking, simplifying installation. 

Figure 2.5 is a simplified block diagram for 
a basic digital media adapter, showing the com- 
mon blocks. Today, all of this is on a single low- 
cost chip. 



Digital Television Set-Top Boxes 

The digital television standards fall into 
seven major categories: 

ATSC (Advanced Television Systems Committee)

DVB (Digital Video Broadcasting)

ARIB (Association of Radio Industries and Businesses)

IPTV (including DVB and ARIB over IP)

Open digital cable standards, such as OpenCable

Proprietary digital cable standards

Proprietary digital satellite standards

Figure 2.5. Simplified Block Diagram of a Digital Media Adapter.



Originally based on MPEG-2 video and 
Dolby® Digital or MPEG audio, they now sup- 
port new advanced audio and video standards, 
such as MPEG-4 HE-AAC audio, Dolby® Digi- 
tal Plus audio, MPEG-4.10 (H.264) video, and 
SMPTE 421M (VC-1) video. 

Figure 2.6 is a simplified block diagram for 
a digital television set-top box, showing the 
common audio and video processing blocks. It 
is used to receive digital television broadcasts, 
from either terrestrial (over-the-air), cable, or
satellite. A digital television may include this 
circuitry inside the television. 



Many set-top boxes now include two tun- 
ers and digital video recorder (DVR) capability. 
This enables recording one program onto an 
internal HDD while watching another. Two 
tuners are also common in digital television 
receivers to support a picture-in-picture (PIP) 
feature. 

Figure 2.6. Simplified Block Diagram of a Digital Television Set-Top Box.



Chapter 3 



Color Spaces 



A color space is a mathematical represen- 
tation of a set of colors. The three most popular 
color models are RGB (used in computer 
graphics); YIQ, YUV, or YCbCr (used in video 
systems); and CMYK (used in color printing).
However, none of these color spaces is directly 
related to the intuitive notions of hue, satura- 
tion, and brightness. This resulted in the tem- 
porary pursuit of other models, such as HSI 
and HSV, to simplify programming, process- 
ing, and end-user manipulation. 

All of the color spaces can be derived from 
the RGB information supplied by devices such 
as cameras and scanners. 



RGB Color Space 

The red, green, and blue (RGB) color 
space is widely used for computer graphics 
and displays. Red, green, and blue are three 
primary additive colors (individual compo- 
nents are added together to form a desired 
color) and are represented by a three-dimen- 
sional, Cartesian coordinate system (Figure 
3.1). The indicated diagonal of the cube, with 
equal amounts of each primary component, 
represents various gray levels. Table 3.1 con- 
tains the RGB values for 100% amplitude, 100% 
saturated color bars, a common video test sig- 
nal. 



Figure 3.1. The RGB Color Cube.

     Nominal
     Range      White  Yellow  Cyan  Green  Magenta  Red  Blue  Black

R    0 to 255   255    255     0     0      255      255  0     0
G    0 to 255   255    255     255   255    0        0    0     0
B    0 to 255   255    0       255   0      255      0    255   0

Table 3.1. 100% RGB Color Bars.
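
The values in Table 3.1 follow a simple pattern: reading the bars from white to black, the G, R, and B components count down as a 3-bit binary number. The Python sketch below regenerates the table from that observation. The function and constant names are assumptions made for this example.

```python
BAR_NAMES = ["White", "Yellow", "Cyan", "Green",
             "Magenta", "Red", "Blue", "Black"]


def color_bars(max_level=255):
    """Return {bar name: (R, G, B)} for 100% amplitude, 100% saturated bars."""
    bars = {}
    for index, name in enumerate(BAR_NAMES):
        code = 7 - index          # 3-bit pattern: bit 2 = G, bit 1 = R, bit 0 = B
        g = (code >> 2) & 1
        r = (code >> 1) & 1
        b = code & 1
        bars[name] = (r * max_level, g * max_level, b * max_level)
    return bars


bars = color_bars()
for name in BAR_NAMES:
    print(f"{name:8s} R={bars[name][0]:3d} G={bars[name][1]:3d} B={bars[name][2]:3d}")
```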



The RGB color space is the most prevalent 
choice for computer graphics because color 
displays use red, green, and blue to create the 
desired color. Therefore, the choice of the 
RGB color space simplifies the architecture 
and design of the system. Also, a system that is 
designed using the RGB color space can take 
advantage of a large number of existing soft- 
ware routines, since this color space has been 
around for a number of years. 

However, RGB is not very efficient when 
dealing with real-world images. All three RGB 
components need to be of equal bandwidth to 
generate any color within the RGB color cube. 
The result of this is a frame buffer that has the 
same pixel depth and display resolution for 
each RGB component. Also, processing an 
image in the RGB color space is usually not the 
most efficient method. For example, to modify 
the intensity or color of a given pixel, the three 
RGB values must be read from the frame 
buffer, the intensity or color calculated, the 
desired modifications performed, and the new 
RGB values calculated and written back to the 
frame buffer. If the system had access to an 
image stored directly in the intensity and color 
format, some processing steps would be faster. 



For these and other reasons, many video 
standards use luma and two color difference 
signals. The most common are the YUV, YIQ, 
and YCbCr color spaces. Although all are 
related, there are some differences. 

sRGB 

Due to the many implementations of the 
RGB color space, the sRGB color space was 
formalized. The specification for sRGB (IEC 
61966-2-1) uses BT.709 chromaticity, D65 reference
white, a display gamma of 2.2, and linear
RGB (8 bits per color).

sRGB values have a normalized range of 0- 
1, with 8-bit digital sRGB values having a range 
of 0-255 for black-white. A version called “Stu- 
dio RGB” uses an 8-bit range of 16-235 for 
black-white, enabling compatibility with video 
applications. 
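
One way to move between the two 8-bit ranges is a plain linear rescale, sketched below in Python. The text above gives only the ranges, so the linear mapping, the rounding, and the clipping of out-of-range studio values are assumptions of this example, as are the function names.

```python
def full_to_studio(value):
    """Map a full-range 8-bit value (0-255) to the studio range (16-235)."""
    return 16 + round(value * 219 / 255)


def studio_to_full(value):
    """Map a studio-range 8-bit value (16-235) to full range (0-255).

    Values below 16 or above 235 are clipped to 0 and 255 (an assumption).
    """
    scaled = round((value - 16) * 255 / 219)
    return max(0, min(255, scaled))


black, white = full_to_studio(0), full_to_studio(255)
```

The studio range spans 219 steps (235 - 16), so the scale factor between the two ranges is 219/255 in one direction and 255/219 in the other.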

One limitation of sRGB is that since the 
normalized values are restricted to the 0-1 
range, colors outside the gamut (the triangle 
produced by them) cannot be used. For this 
reason, the extended RGB color space, 
“scRGB,” was developed. 

scRGB

The scRGB color space (formerly called 
sRGB64) extends the dynamic range, color 
gamut, and bit precision over sRGB. The 
scRGB gamut is not only much larger than the 
sRGB gamut, but it is larger than what the 
human visual system can see. The specifica- 
tion for scRGB (IEC 61966-2-2) uses BT.709 
chromaticity, D65 reference white, and linear 
RGB data (16 bits per color). 

Instead of using a normalized range of 0-1, 
a range of -0.5 to +7.4999 is supported. Values 
below 0 and above 1 are what enable scRGB to 
have a larger gamut, compared to sRGB, even 
though it has the same primary colors. The 
correlation between the linear 16-bit scRGB
values and the normalized range is:

00000 = -0.5
04096 = 0.0 (black)
12288 = 1.0 (white)
16384 = 1.5
65535 = 7.4999
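
The listed values all follow from dividing the 16-bit code by 8192 and subtracting 0.5. A short Python sketch of that mapping and its inverse follows; the function names and the clipping in the inverse are assumptions of this example.

```python
def scrgb16_to_normalized(code):
    """Convert a linear 16-bit scRGB code (0-65535) to its normalized value."""
    return code / 8192 - 0.5


def normalized_to_scrgb16(value):
    """Convert a normalized value to a 16-bit scRGB code, clipped to 0-65535."""
    return max(0, min(65535, round((value + 0.5) * 8192)))


for code in (0, 4096, 12288, 16384, 65535):
    print(f"{code:05d} = {scrgb16_to_normalized(code):.4f}")
```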

After gamma correction, the correlation
between the nonlinear 16-bit scRGB' values
and the normalized range is:

00000 = -0.7354
04096 = 0.0 (black)
12288 = 1.0 (white)
65535 = 2.3876

scRGB to sRGB Conversion 

To convert linear 16-bit scRGB to 
gamma-corrected 8-bit sRGB (notated as 
sR'G'B'_8): 

scR = (scR_16 / 8192) - 0.5 
scG = (scG_16 / 8192) - 0.5 
scB = (scB_16 / 8192) - 0.5 

if (scR_16, scG_16, scB_16) ≤ 4095 
   sR'_8 = 0 
   sG'_8 = 0 
   sB'_8 = 0 

if 4096 ≤ (scR_16, scG_16, scB_16) ≤ 4243 
   sR'_8 = round[4.500 × scR × 255] 
   sG'_8 = round[4.500 × scG × 255] 
   sB'_8 = round[4.500 × scB × 255] 

if 4244 ≤ (scR_16, scG_16, scB_16) ≤ 12288 
   sR'_8 = round[(1.099 × scR^0.45 - 0.099) × 255] 
   sG'_8 = round[(1.099 × scG^0.45 - 0.099) × 255] 
   sB'_8 = round[(1.099 × scB^0.45 - 0.099) × 255] 

if (scR_16, scG_16, scB_16) ≥ 12289 
   sR'_8 = 255 
   sG'_8 = 255 
   sB'_8 = 255 
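
The four-range conversion above can be sketched for a 
single component as follows. This is a minimal 
illustration, not a reference implementation: the 
function name is hypothetical, and the treatment of 
the exact boundary codes (4095/4096, 4243/4244, and 
12288/12289) is an assumption based on the ranges 
listed above. 

```python
def scrgb16_to_srgb8(code):
    """Convert one linear 16-bit scRGB component to 8-bit gamma-corrected sRGB."""
    if code <= 4095:                 # at or below black: clip to 0
        return 0
    if code >= 12289:                # above reference white: clip to 255
        return 255
    linear = code / 8192.0 - 0.5     # normalized linear value, 0.0 to 1.0 here
    if code <= 4243:                 # linear segment close to black
        value = 4.5 * linear
    else:                            # power-law segment
        value = 1.099 * linear ** 0.45 - 0.099
    return int(value * 255 + 0.5)    # round to the nearest 8-bit code


if __name__ == "__main__":
    for code in (0, 4096, 4243, 4244, 12288, 65535):
        print(code, scrgb16_to_srgb8(code))
```

The same function is applied independently to the R, 
G, and B components. 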

YUV Color Space 

The YUV color space is used by the PAL 
(Phase Alternation Line), NTSC (National 
Television System Committee), and SECAM 
(Sequentiel Couleur Avec Memoire or Sequen- 
tial Color with Memory) composite color video 
standards. The black-and-white system used 
only luma (Y) information; color information 
(U and V) was added in such a way that a 
black-and-white receiver would still display a 
normal black-and-white picture. Color receiv- 
ers decoded the additional color information to 
display a color picture. 

The basic equations to convert between 
gamma-corrected RGB (notated as R'G'B') 
and YUV are: 

Y = 0.299R' + 0.587G' + 0.114B' 

U = - 0.147R' - 0.289G' + 0.436B' 

= 0.492 (B'-Y) 

V = 0.615R' - 0.515G' - 0.100B' 

= 0.877 (R'-Y) 

R'=Y+ 1.140V 

G' = Y-0.395U- 0.581V 

B' = Y + 2.032U 

For digital R'G'B' values with a range of 
0-255, Y has a range of 0-255, U a range of 
0 to ±112, and V a range of 0 to ±157. These 
equations are usually scaled to simplify the 
implementation in an actual NTSC or PAL 
digital encoder or decoder. 

Note that for digital data, 8-bit YUV and 
R'G'B' data should be saturated at the 0 and 
255 levels to avoid underflow and overflow 
wrap-around problems. 
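
As a rough illustration of the equations and the 
saturation note above, the following sketch converts 
8-bit R'G'B' to YUV and back. The function names and 
the choice to clamp only on the way back to R'G'B' 
are assumptions made for this example. 

```python
def clamp8(x):
    """Round to the nearest integer and saturate to the 8-bit range 0-255."""
    return max(0, min(255, int(round(x))))


def rgb_to_yuv(r, g, b):
    """Gamma-corrected R'G'B' (0-255) to Y, U, V (U and V are signed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v


def yuv_to_rgb(y, u, v):
    """Y, U, V back to 8-bit R'G'B', saturating instead of wrapping around."""
    r = y + 1.140 * v
    g = y - 0.395 * u - 0.581 * v
    b = y + 2.032 * u
    return clamp8(r), clamp8(g), clamp8(b)


if __name__ == "__main__":
    print(rgb_to_yuv(0, 0, 255))     # U close to its positive limit
    print(yuv_to_rgb(255, 0, 100))   # red overflows and is held at 255
```

Without the clamp, a computed red value of 369 stored 
in 8 bits would wrap around to a dark value instead 
of staying at full intensity. 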

If the full range of (B' - Y) and (R' - Y) had 
been used, the composite NTSC and PAL lev- 
els would have exceeded what the (then cur- 
rent) black-and-white television transmitters 
and receivers were capable of supporting. 
Experimentation determined that modulated 
subcarrier excursions of 20% of the luma (Y) 
signal excursion could be permitted above 
white and below black. The scaling factors 
were then selected so that the maximum level 
of 75% amplitude, 100% saturation yellow and 
cyan color bars would be at the white level 
(100 IRE). 



YIQ Color Space 

The YIQ color space, further discussed in 
Chapter 8, is derived from the YUV color space 
and is optionally used by the NTSC composite 
color video standard. (The “I” stands for “in- 
phase” and the “Q” for “quadrature,” which is 
the modulation method used to transmit the 
color information.) The basic equations to 
convert between R'G'B' and YIQ are: 

Y = 0.299R' + 0.587G' + 0.114B' 

I = 0.596R' - 0.275G' - 0.321B' 

= Vcos 33° - Usin 33° 

= 0.736 (R'-Y) -0.268 (B'-Y) 

Q = 0.212R' - 0.523G' + 0.311B' 

= Vsin 33° + Ucos 33° 

= 0.478 (R' - Y) + 0.413 (B' - Y) 

or, using matrix notation: 

[ I ]   [ 0  1 ] [  cos(33°)  sin(33°) ] [ U ] 
[ Q ] = [ 1  0 ] [ -sin(33°)  cos(33°) ] [ V ] 

R' = Y + 0.956I + 0.621Q 
G' = Y - 0.272I - 0.647Q 
B' = Y - 1.107I + 1.704Q 

For digital R'G'B' values with a range of 
0-255, Y has a range of 0-255, I has a range 
of 0 to ±152, and Q has a range of 0 to ±134. 
I and Q are obtained by rotating the U and V 
axes 33°. These equations are usually scaled 
to simplify the implementation in an actual 
NTSC digital encoder or decoder. 

Note that for digital data, 8-bit YIQ and 
R'G'B' data should be saturated at the 0 and 
255 levels to avoid underflow and overflow 
wrap-around problems. 
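
The 33° rotation can be checked numerically. The 
sketch below (function names are illustrative only) 
derives I and Q from U and V and compares the result 
with the direct R'G'B' coefficients given above; 
small differences in the third decimal place are 
expected because the published coefficients are 
rounded. 

```python
import math


def uv_to_iq(u, v, angle_deg=33.0):
    """Rotate the U and V axes by 33 degrees to obtain I and Q."""
    a = math.radians(angle_deg)
    i = v * math.cos(a) - u * math.sin(a)
    q = v * math.sin(a) + u * math.cos(a)
    return i, q


def rgb_to_yiq(r, g, b):
    """Gamma-corrected R'G'B' to YIQ by way of YUV."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    i, q = uv_to_iq(u, v)
    return y, i, q


if __name__ == "__main__":
    # Unit red, green, and blue expose the I and Q coefficients directly.
    for name, rgb in (("R'", (1, 0, 0)), ("G'", (0, 1, 0)), ("B'", (0, 0, 1))):
        y, i, q = rgb_to_yiq(*rgb)
        print(f"{name}: Y={y:.3f} I={i:.3f} Q={q:.3f}")
```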

YCbCr Color Space 

The YCbCr color space was developed as 
part of ITU-R BT.601 during the development 
of a world-wide digital component video stan- 
dard (discussed in Chapter 4). YCbCr is a 
scaled and offset version of the YUV color 
space. Y is defined to have a nominal 8-bit 
range of 16-235; Cb and Cr are defined to have 
a nominal range of 16-240. There are several 
YCbCr sampling formats, such as 4:4:4, 4:2:2, 
4:1:1, and 4:2:0 that are also described. 

RGB-YCbCr Equations: SDTV 

RGB to YCbCr: Analog Equations 

Many specifications assume the source is 
analog R'G'B' with a normalized range of 0-1. 
This is first converted to analog YPbPr: 



Y = 0.299R' + 0.587G' + 0.114B' 

Pb = -0.169R' - 0.331G' + 0.500B' 

Pr = 0.500R' - 0.419G' - 0.081B' 

To generate 8-bit YCbCr with the proper 
values, YPbPr is then quantized to 8 bits: 

Y = round [219Y + 16] 

Cb = round [224Pb + 128] 

Cr = round [224Pr + 128] 
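
The two steps above (analog YPbPr, then 8-bit 
quantization) can be sketched as a single function. 
The function name is hypothetical and rounding is 
taken as round-half-up, which is an assumption. For 
75% color bars the results agree with the SDTV rows 
of Table 3.2. 

```python
import math


def rgb_to_ycbcr_sdtv(r, g, b):
    """Normalized (0-1) gamma-corrected R'G'B' to 8-bit SDTV YCbCr."""
    # Step 1: analog YPbPr
    y = 0.299 * r + 0.587 * g + 0.114 * b
    pb = -0.169 * r - 0.331 * g + 0.500 * b
    pr = 0.500 * r - 0.419 * g - 0.081 * b

    # Step 2: quantize to 8 bits (Y: 16-235, Cb and Cr: 16-240)
    def rnd(x):
        return int(math.floor(x + 0.5))

    return rnd(219 * y + 16), rnd(224 * pb + 128), rnd(224 * pr + 128)


if __name__ == "__main__":
    bars = {
        "white": (0.75, 0.75, 0.75),
        "yellow": (0.75, 0.75, 0.0),
        "cyan": (0.0, 0.75, 0.75),
        "red": (0.75, 0.0, 0.0),
        "black": (0.0, 0.0, 0.0),
    }
    for name, rgb in bars.items():
        print(name, rgb_to_ycbcr_sdtv(*rgb))
```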

RGB to YCbCr: Digital Equations 

To convert 8-bit digital R'G'B' data with a 
16-235 nominal range (Studio R'G'B') to 
YCbCr, the analog equations may be simplified 
to: 

Y = 0.299R' + 0.587G' + 0.114B' 

Cb = -0.172R' - 0.339G' + 0.511B' + 128 
Cr = 0.511R' - 0.428G' - 0.083B' + 128 





             Nominal 
             Range       White  Yellow  Cyan  Green  Magenta  Red  Blue  Black 

SDTV   Y     16 to 235   180    162     131   112    84       65   35    16 
       Cb    16 to 240   128    44      156   72     184      100  212   128 
       Cr    16 to 240   128    142     44    58     198      212  114   128 

HDTV   Y     16 to 235   180    168     145   133    63       51   28    16 
       Cb    16 to 240   128    44      147   63     193      109  212   128 
       Cr    16 to 240   128    136     44    52     204      212  120   128 

Table 3.2. 75% YCbCr Color Bars. 

YCbCr to RGB: Analog Equations 

Many specifications assume the source is 
analog YPbPr. This is first converted to analog 
R'G'B': 

R'=Y+ 1.402Pr 

G' = Y - 0.714Pr - 0.344Pb 

B' = Y+ 1.772Pb 

To generate 8-bit R'G'B' with a 16-235 
nominal range (Studio R'G'B'), R'G'B' is then 
quantized to 8 bits: 

out' = round[219in' + 16] 

YCbCr to RGB: Digital Equations 

To convert 8-bit YCbCr to R'G'B' data with 
a 16-235 nominal range (Studio R'G'B'), the 
analog equations may be simplified to: 

R' =Y+ 1.371 (Cr- 128) 

G' = Y - 0.698(Cr - 128) - 0.336(Cb - 128) 

B' = Y + 1.732 (Cb - 128) 

YCbCr to RGB: General Considerations 

When performing YCbCr to R'G'B' conversion, 
the resulting R'G'B' values have a nominal 
range of 16-235, with possible occasional 
excursions into the 0-15 and 236-255 values. 
This happens because Y and CbCr occasionally 
go outside the 16-235 and 16-240 ranges, 
respectively, as a result of video processing 
and noise. Note that 8-bit YCbCr and R'G'B' 
data should be saturated at the 0 and 255 
levels to avoid underflow and overflow 
wrap-around problems. 

Table 3.2 lists the YCbCr values for 75% 
amplitude, 100% saturated color bars, a com- 
mon video test signal. 



Computer Systems Considerations 

If the R'G'B' data has a range of 0-255, as 
is commonly found in computer systems, the 
following equations may be more convenient 
to use: 

Y = 0.257R' + 0.504G' + 0.098B' + 16 
Cb = -0.148R' - 0.291G' + 0.439B' + 128 
Cr = 0.439R' - 0.368G' - 0.071B' + 128 

R' = 1.164(Y - 16) + 1.596(Cr- 128) 

G' = 1.164(Y - 16) - 0.813 (Cr - 128) - 
0.391 (Cb - 128) 

B' = 1.164(Y- 16) + 2.018(Cb - 128) 

Note that 8-bit YCbCr and R'G'B' data 
should be saturated at the 0 and 255 levels to 
avoid underflow and overflow wrap-around 
problems. 
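
A minimal sketch of the 0-255 equations above, with 
saturation applied in both directions. The function 
names are illustrative; a real design would normally 
use scaled integer arithmetic rather than floating 
point. 

```python
def clamp8(x):
    """Round and saturate to 0-255 so results never wrap around."""
    return max(0, min(255, int(round(x))))


def rgb_to_ycbcr(r, g, b):
    """0-255 R'G'B' (computer range) to 8-bit SDTV YCbCr."""
    y = 0.257 * r + 0.504 * g + 0.098 * b + 16
    cb = -0.148 * r - 0.291 * g + 0.439 * b + 128
    cr = 0.439 * r - 0.368 * g - 0.071 * b + 128
    return clamp8(y), clamp8(cb), clamp8(cr)


def ycbcr_to_rgb(y, cb, cr):
    """8-bit SDTV YCbCr to 0-255 R'G'B' (computer range)."""
    r = 1.164 * (y - 16) + 1.596 * (cr - 128)
    g = 1.164 * (y - 16) - 0.813 * (cr - 128) - 0.391 * (cb - 128)
    b = 1.164 * (y - 16) + 2.018 * (cb - 128)
    return clamp8(r), clamp8(g), clamp8(b)


if __name__ == "__main__":
    print(rgb_to_ycbcr(255, 255, 255))   # white maps to nominal peak luma
    print(ycbcr_to_rgb(255, 128, 255))   # out-of-range input is saturated
```

Because both directions round to 8 bits, a round trip 
may differ from the starting value by a code or two. 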

RGB-YCbCr Equations: HDTV 

RGB to YCbCr: Analog Equations 

Many specifications assume the source is 
analog R'G'B' with a normalized range of 0-1. 
This is first converted to analog YPbPr: 

Y = 0.213R' + 0.715G' + 0.072B' 

Pb = -0.115R' - 0.385G' + 0.500B' 

Pr = 0.500R' - 0.454G' - 0.046B' 

To generate 8-bit YCbCr with the proper 
values, YPbPr is then quantized to 8 bits: 

Y = round [219Y + 16] 

Cb = round [224Pb + 128] 

Cr= round [224Pr + 128] 

RGB to YCbCr: Digital Equations 

To convert 8-bit digital R'G'B' data with a 
16-235 nominal range (Studio R'G'B') to 
YCbCr, the analog equations may be simplified 
to: 

Y = 0.213R' + 0.715G' + 0.072B' 

Cb = -0.117R' - 0.394G' + 0.511B' + 128 
Cr = 0.511R' - 0.464G' - 0.047B' + 128 

YCbCr to RGB: Analog Equations 

Many specifications assume the source is 
analog YPbPr. This is first converted to analog 
R'G'B': 

R' = Y + 1.575Pr 

G' = Y - 0.468Pr - 0.187Pb 

B' = Y + 1.856Pb 

To generate 8-bit R'G'B' with a 16-235 
nominal range (Studio R'G'B'), R'G'B' is then 
quantized to 8 bits: 

out' = round[219in' + 16] 

YCbCr to RGB: Digital Equations 

To convert 8-bit YCbCr to R'G'B' data with 
a 16-235 nominal range (Studio R'G'B'), the 
analog equations may be simplified to: 

R' = Y + 1.540(Cr - 128) 

G' = Y - 0.459 (Cr - 128) - 0.183(Cb - 128) 

B' = Y + 1.816(Cb - 128) 

YCbCr to RGB: General Considerations 

When performing YCbCr to R'G'B' conversion, 
the resulting R'G'B' values have a nominal 
range of 16-235, with possible occasional 
excursions into the 0-15 and 236-255 values. 
This happens because Y and CbCr occasionally 
go outside the 16-235 and 16-240 ranges, 
respectively, as a result of video processing 
and noise. Note that 8-bit YCbCr and R'G'B' 
data should be saturated at the 0 and 255 
levels to avoid underflow and overflow 
wrap-around problems. 

Table 3.2 lists the YCbCr values for 75% 
amplitude, 100% saturated color bars, a com- 
mon video test signal. 

Computer Systems Considerations 

If the R'G'B' data has a range of 0-255, as 
is commonly found in computer systems, the 
following equations may be more convenient 
to use: 

Y = 0.183R' + 0.614G' + 0.062B' + 16 

Cb = -0.101R' - 0.338G' + 0.439B' + 128 

Cr = 0.439R' - 0.399G' - 0.040B' + 128 

R' = 1.164(Y - 16) + 1.793(Cr- 128) 

G' = 1.164(Y- 16) - 0.534 (Cr - 128) - 
0.213 (Cb - 128) 

B' = 1.164(Y- 16) + 2.115(Cb - 128) 

Note that 8-bit YCbCr and R'G'B' data 
should be saturated at the 0 and 255 levels to 
avoid underflow and overflow wrap-around 
problems. 
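
The 0-255 encoding coefficients above appear to 
follow from the analog HDTV coefficients by scaling: 
the Y row by 219/255 and the Cb and Cr rows by 
224/255. The sketch below, whose names are 
illustrative, performs that scaling and rounds to 
three decimals; it is offered as a consistency check 
under that assumption, not as the derivation used by 
any particular standard. 

```python
# Analog HDTV coefficients for normalized R'G'B' (from the equations above).
ANALOG_HDTV = {
    "Y": (0.213, 0.715, 0.072),
    "Cb": (-0.115, -0.385, 0.500),
    "Cr": (0.500, -0.454, -0.046),
}


def scale_for_0_255_rgb(analog=ANALOG_HDTV):
    """Scale analog coefficients so that 0-255 R'G'B' yields 8-bit YCbCr."""
    scaled = {}
    for name, coeffs in analog.items():
        excursion = 219 if name == "Y" else 224   # nominal 8-bit excursion
        scaled[name] = tuple(round(c * excursion / 255, 3) for c in coeffs)
    return scaled


if __name__ == "__main__":
    for name, coeffs in scale_for_0_255_rgb().items():
        print(name, coeffs)
```

The offsets (+16 for Y, +128 for Cb and Cr) are then 
added unchanged. 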

4:4:4 YCbCr Format 

Figure 3.2 illustrates the positioning of 
YCbCr samples for the 4:4:4 format. Each sam- 
ple has a Y, a Cb, and a Cr value. Each sample 
is typically 8 bits (consumer applications) or 10 
bits (pro-video applications) per component. 
Each sample therefore requires 24 bits (or 30 
bits for pro-video applications). 

4:2:2 YCbCr Format 

Figure 3.3 illustrates the positioning of 
YCbCr samples for the 4:2:2 format. For every 
two horizontal Y samples, there is one Cb and 
Cr sample. Each sample is typically 8 bits (con- 
sumer applications) or 10 bits (pro-video appli- 
cations) per component. Each sample 
therefore requires 16 bits (or 20 bits for pro- 
video applications), usually formatted as 
shown in Figure 3.4. 

To display 4:2:2 YCbCr data, it is first con- 
verted to 4:4:4 YCbCr data, using interpolation 
to generate the missing Cb and Cr samples. 
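
As an illustration of the interpolation step, the 
sketch below rebuilds 4:4:4 samples for one scan line 
using simple two-point linear interpolation. This is 
an assumption chosen for brevity; practical designs 
often use longer filters. The function name and the 
edge handling (repeating the last chroma sample) are 
likewise illustrative. 

```python
def upsample_422_to_444(y, cb, cr):
    """Rebuild (Y, Cb, Cr) triples for one line of co-sited 4:2:2 data.

    y holds one luma value per sample; cb and cr hold one value for every
    two luma samples, co-sited with the even-numbered luma samples.
    """
    def interpolate(chroma):
        out = []
        for k, c in enumerate(chroma):
            out.append(c)                            # co-sited sample is kept
            nxt = chroma[k + 1] if k + 1 < len(chroma) else c
            out.append((c + nxt + 1) // 2)           # rounded midpoint
        return out

    return list(zip(y, interpolate(cb), interpolate(cr)))


if __name__ == "__main__":
    line = upsample_422_to_444([16, 50, 100, 150], [100, 120], [128, 130])
    for sample in line:
        print(sample)
```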

4:1:1 YCbCr Format 

Figure 3.5 illustrates the positioning of 
YCbCr samples for the 4:1:1 format (also 
known as YUV12), used in some consumer 
video and DV video compression applications. 
For every four horizontal Y samples, there is 
one Cb and Cr value. Each component is typi- 
cally 8 bits. Each sample therefore requires 12 
bits, usually formatted as shown in Figure 3.6. 



[Figure 3.2 legend: active line numbers shown as X 
belong to field 1 (576i field 2); those shown as [X] 
belong to field 2 (576i field 1). Open circles mark 
Cb, Cr samples; filled circles mark Y samples.] 

Figure 3.2. 4:4:4 Co-Sited Sampling. The 
sampling positions on the active scan lines 
of an interlaced picture. 



To display 4:1:1 YCbCr data, it is first con- 
verted to 4:4:4 YCbCr data, using interpolation 
to generate the missing Cb and Cr samples. 

4:2:0 YCbCr Format 

Rather than the horizontal-only 2:1 reduc- 
tion of Cb and Cr used by 4:2:2, 4:2:0 YCbCr 
implements a 2:1 reduction of Cb and Cr in 
both the vertical and horizontal directions. It is 
commonly used for video compression. 

As shown in Figures 3.7 through 3.11, 
there are several 4:2:0 sampling formats. Table 
3.3 lists the YCbCr formats for various DV 
applications. 

To display 4:2:0 YCbCr data, it is first con- 
verted to 4:4:4 YCbCr data, using interpolation 
to generate the new Cb and Cr samples. Note 
that some solutions do not properly convert 
the 4:2:0 YCbCr data to the 4:4:4 format, result- 
ing in a “chroma bug.” 



[Figure 3.3 legend: active line numbers shown as X 
belong to field 1 (576i field 2); those shown as [X] 
belong to field 2 (576i field 1). Open circles mark 
Cb, Cr samples; filled circles mark Y samples.] 

Figure 3.3. 4:2:2 Co-Sited Sampling. The 
sampling positions on the active scan lines 
of an interlaced picture. 

[Figure 3.4 shows six consecutive 16-bit words, for 
samples 0 through 5. Each word carries the eight 
luma bits Y7-Y0 of its own sample plus eight chroma 
bits: CB7-CB0 in words 0, 2, and 4, and CR7-CR0 in 
words 1, 3, and 5. The chroma values belong to 
samples 0, 2, and 4. The suffixes -0 through -5 
identify the sample the data belongs to.] 

Figure 3.4. 4:2:2 Frame Buffer Formatting. 



[Figure 3.5 legend: active line numbers shown as X 
belong to field 1 (576i field 2); those shown as [X] 
belong to field 2 (576i field 1). Open circles mark 
Cb, Cr samples; filled circles mark Y samples.] 

Figure 3.5. 4:1:1 Co-Sited Sampling. The 
sampling positions on the active scan lines 
of an interlaced picture. 

[Figure 3.6 shows six consecutive 12-bit words, for 
samples 0 through 5. Each word carries the eight 
luma bits Y7-Y0 of its own sample plus two Cb bits 
and two Cr bits. Four consecutive words together 
carry CB7-CB0 and CR7-CR0 for sample 0 (bits 7-6, 
5-4, 3-2, then 1-0); the next four words carry the 
chroma for sample 4. The suffixes -0 through -5 
identify the sample the data belongs to.] 

Figure 3.6. 4:1:1 Frame Buffer Formatting. 

[Figures 3.7 and 3.8 legend: open circles mark 
calculated Cb, Cr samples; filled circles mark Y 
samples.] 

Figure 3.7. 4:2:0 Sampling for H.261, H.263, 
and MPEG-1. The sampling positions on the 
active scan lines of a progressive or 
noninterlaced picture. 

Figure 3.8. 4:2:0 Sampling for MPEG-2, 
MPEG-4.2, MPEG-4.10 (H.264), and SMPTE 
421M (VC-1). The sampling positions on the 
active scan lines of a progressive or 
noninterlaced picture. 

Table 3.3. YCbCr Formats for Various DV Applications. 

[Figures 3.9 and 3.10 legend: the active lines of 
field N and field N + 1 are shown side by side, with 
line numbers of the second field in brackets. Open 
circles mark calculated Cb, Cr samples; filled 
circles mark Y samples.] 

Figure 3.9. 4:2:0 Sampling for MPEG-2, MPEG-4.2, 
MPEG-4.10 (H.264), and SMPTE 421M (VC-1). The 
sampling positions on the active scan lines of an 
interlaced picture (top_field_first = 1). 

Figure 3.10. 4:2:0 Sampling for MPEG-2, MPEG-4.2, 
MPEG-4.10 (H.264), and SMPTE 421M (VC-1). The 
sampling positions on the active scan lines of an 
interlaced picture (top_field_first = 0). 

[Figure 3.11 legend: open circles mark Cr samples, 
squares mark Cb samples, and filled circles mark Y 
samples.] 

Figure 3.11. 4:2:0 Co-Sited Sampling for 576i DV 
and DVCAM. The sampling positions on the active 
scan lines of an interlaced picture. 



xvYCC Color Space 

The xvYCC (extended gamut YCbCr for 
video) color space extends the color gamut of 
normal YCbCr, enabling 1.8x more colors to be 
reproduced. The specification for xvYCC (IEC 
61966-2-4) uses BT.709 chromaticity and D65 
reference white. The equations for converting 
between scR'G'B' and xvYCbCr are the same 
as those used for converting between R'G'B' 
and YCbCr. 

xvYCC-based YCbCr data has an 8-bit range 
of 1-254, enabling backwards compatibility 
with existing designs. Measured from code 16 
in units of the nominal excursion (219 for Y, 
224 for CbCr), Y spans -15/219 to +238/219 
(-0.068493 to +1.086758) and CbCr spans 
-15/224 to +238/224 (-0.066964 to +1.062500). 
HDMI uses Gamut Boundary Description 
metadata to signal that xvYCC video data is 
being used. 



PhotoYCC Color Space 

PhotoYCC (a trademark of Eastman 
Kodak Company) was developed to encode 
Photo CD image data. The goal was to develop 
a display-device-independent color space. For 
maximum video display efficiency, the color 
space is based upon ITU-R BT.601 and BT.709. 

The encoding process (RGB to PhotoYCC) 
assumes CIE Standard Illuminant D65 and that 
the spectral sensitivities of the image capture 
system are proportional to the color-matching 
functions of the BT.709 reference primaries. 
The RGB values, unlike those for a computer 
graphics system, may be negative. PhotoYCC 
includes colors outside the BT.709 color 
gamut; these are encoded using negative val- 
ues. 

RGB to PhotoYCC 

Linear RGB data (normalized to have values 
of 0 to 1) is nonlinearly transformed to 
PhotoYCC as follows: 

for R, G, B ≥ 0.018 
   R' = 1.099 R^0.45 - 0.099 
   G' = 1.099 G^0.45 - 0.099 
   B' = 1.099 B^0.45 - 0.099 

for -0.018 < R, G, B < 0.018 
   R' = 4.5 R 
   G' = 4.5 G 
   B' = 4.5 B 

for R, G, B ≤ -0.018 
   R' = -1.099 |R|^0.45 + 0.099 
   G' = -1.099 |G|^0.45 + 0.099 
   B' = -1.099 |B|^0.45 + 0.099 

From R'G'B' with a 0-255 range, a luma 
and two chrominance signals (C1 and C2) are 
generated: 

Y = 0.213R' + 0.419G' + 0.081B' 

C1 = -0.131R' - 0.256G' + 0.387B' + 156 

C2 = 0.373R' - 0.312G' - 0.061B' + 137 

As an example, a 20% gray value (R, G, and 
B = 0.2) would be recorded on the PhotoCD 
disc using the following values: 

Y = 79 

C1 = 156 

C2 = 137 
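
The 20% gray example can be reproduced with a short 
sketch of the encoding path. The function names are 
hypothetical, and the negative branch of the 
nonlinear transform is written here as the mirror 
image of the positive branch so that the curve is 
continuous at -0.018; that is an assumption of this 
sketch. 

```python
def photoycc_nonlinear(x):
    """Nonlinear transform of one linear component (negative values allowed)."""
    if x >= 0.018:
        return 1.099 * x ** 0.45 - 0.099
    if x > -0.018:
        return 4.5 * x
    return -(1.099 * abs(x) ** 0.45 - 0.099)   # mirrored positive branch


def rgb_to_photoycc(r, g, b):
    """Linear RGB (normalized) to 8-bit PhotoYCC luma and chroma values."""
    rp, gp, bp = (255 * photoycc_nonlinear(c) for c in (r, g, b))
    y = 0.213 * rp + 0.419 * gp + 0.081 * bp
    c1 = -0.131 * rp - 0.256 * gp + 0.387 * bp + 156
    c2 = 0.373 * rp - 0.312 * gp - 0.061 * bp + 137
    return int(round(y)), int(round(c1)), int(round(c2))


if __name__ == "__main__":
    print(rgb_to_photoycc(0.2, 0.2, 0.2))   # the 20% gray example
```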



PhotoYCC to RGB 

Since PhotoYCC attempts to preserve the 
dynamic range of film, decoding PhotoYCC 
images requires the selection of a color space 
and range appropriate for the output device. 
Thus, the decoding equations are not always 
the exact inverse of the encoding equations. 
The following equations are suitable for gener- 
ating RGB values for driving a CRT display, 
and assume a unity relationship between the 
luma in the encoded image and the displayed 
image. 

R' = 0.981Y + 1.315(C2 - 137) 

G' = 0.981Y - 0.311(C1 - 156) - 0.669(C2 - 137) 

B' = 0.981Y + 1.601(C1 - 156) 

The R'G'B' values should be saturated to a 
range of 0 to 255. The equations above assume 
the display uses phosphor chromaticities that 
are the same as the BT.709 reference primaries, 
and that the video signal luma (V) and the 
display luminance (L) have the relationship: 

for V ≥ 0.0812 
   L = ((V + 0.099) / 1.099)^(1/0.45) 

for V < 0.0812 
   L = V / 4.5 

HSI, HLS, and HSV Color 
Spaces 

The HSI (hue, saturation, intensity) and 
HSV (hue, saturation, value) color spaces were 
developed to be more “intuitive” in manipulat- 
ing color and were designed to approximate 
the way humans perceive and interpret color. 
They were developed when colors had to be 
specified manually, and are rarely used now 
that users can select colors visually or specify 
Pantone colors. These color spaces are dis- 
cussed for historic interest. HLS (hue, light- 
ness, saturation) is similar to HSI; the term 
lightness is used rather than intensity. 

The difference between HSI and HSV is 
the computation of the brightness component 
(I or V), which determines the distribution and 
dynamic range of both the brightness (I or V) 
and saturation (S). The HSI color space is best 
for traditional image processing functions such 
as convolution, equalization, histograms, and 
so on, which operate by manipulation of the 
brightness values since I is equally dependent 
on R, G, and B. The HSV color space is pre- 
ferred for manipulation of hue and saturation 
(to shift colors or adjust the amount of color) 
since it yields a greater dynamic range of satu- 
ration. 

Figure 3.12 illustrates the single hexcone 
HSV color model. The top of the hexcone cor- 
responds to V = 1, or the maximum intensity 
colors. The point at the base of the hexcone is 
black and here V = 0. Complementary colors 
are 180° opposite one another as measured by 
H, the angle around the vertical axis (V), with 
red at 0°. The value of S is a ratio, ranging from 
0 on the center line vertical axis (V) to 1 on the 
sides of the hexcone. Any value of S between 0 
and 1 may be associated with the point V = 0. 
The point S = 0, V = 1 is white. Intermediate 
values of V for S = 0 are the grays. Note that 
when S = 0, the value of H is irrelevant. From 
an artist’s viewpoint, any color with V = 1, S = 1 
is a pure pigment (whose color is defined by 
H). Adding white corresponds to decreasing S 
(without changing V); adding black corresponds 
to decreasing V (without changing S). 
Tones are created by decreasing both S and V. 
Table 3.4 lists the 75% amplitude, 100% satu- 
rated HSV color bars. 

Figure 3.13 illustrates the double hexcone 
HSI color model. The top of the hexcone 
corresponds to I = 1, or white. The point at the 
base of the hexcone is black and here I = 0. 
Complementary colors are 180° opposite one 
another as measured by H, the angle around the 
vertical axis (I), with red at 0° (for 
consistency with the HSV model, we have changed 
from the Tektronix convention of blue at 0°). 
The value of S ranges from 0 on the vertical 
axis (I) to 1 on the surfaces of the hexcone. 
The grays all have S = 0, but maximum 
saturation of hues is at S = 1, I = 0.5. Table 
3.5 lists the 75% amplitude, 100% saturated HSI 
color bars. 
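
For experimentation, Python's standard colorsys 
module provides single-hexcone HSV and double-hexcone 
HLS conversions. The sketch below assumes that the 
colorsys HLS model is an acceptable stand-in for the 
double-hexcone HSI model described here (lightness 
playing the role of I); the helper name is 
illustrative. 

```python
import colorsys


def describe(rgb):
    """Return (hue in degrees, HSV S, HSV V, HLS lightness, HLS S)."""
    h, s_hsv, v = colorsys.rgb_to_hsv(*rgb)
    _, lightness, s_hls = colorsys.rgb_to_hls(*rgb)
    return round(h * 360), s_hsv, v, lightness, s_hls


if __name__ == "__main__":
    bars_75 = {
        "yellow": (0.75, 0.75, 0.0),
        "cyan": (0.0, 0.75, 0.75),
        "green": (0.0, 0.75, 0.0),
        "magenta": (0.75, 0.0, 0.75),
        "red": (0.75, 0.0, 0.0),
        "blue": (0.0, 0.0, 0.75),
    }
    for name, rgb in bars_75.items():
        print(name, describe(rgb))
```

For every saturated 75% bar, V is 0.75 in the HSV 
model while the lightness is 0.375 in the double 
hexcone model, since lightness is the average of the 
largest and smallest components. 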



Chromaticity Diagram 

The color gamut perceived by a person 
with normal vision (the 1931 CIE Standard 
Observer) is shown in Figure 3.14. The dia- 
gram and underlying mathematics were 
updated in 1960 and 1976; however, the NTSC 
television system is based on the 1931 specifi- 
cations. 

Color perception was measured by viewing 
combinations of the three standard CIE 
(International Commission on Illumination, or 
Commission Internationale de l'Éclairage) primary 
colors: red with a 700-nm wavelength, green at 
546.1 nm, and blue at 435.8 nm. These primary 
colors, and the other spectrally pure colors 
resulting from mixing of the primary colors, 
are located along the curved outer boundary 
line (called the spectrum locus), shown in Fig- 
ure 3.14. 

The ends of the spectrum locus (at red and 
blue) are connected by a straight line that rep- 
resents the purples, which are combinations of 
red and blue. The area within this closed 
boundary contains all the colors that can be 
generated by mixing light of different colors. 
The closer a color is to the boundary, the more 
saturated it is. Colors within the boundary are 
perceived as becoming more pastel as the center 
of the diagram (white) is approached. Each 
point on the diagram, representing a unique 
color, may be identified by its x and y 
coordinates. 

Figure 3.12. Single Hexcone HSV Color Model. 

      Nominal 
      Range        White  Yellow  Cyan  Green  Magenta  Red    Blue   Black 

H     0° to 360°   -      60°     180°  120°   300°     0°     240°   - 
S     0 to 1       0      1       1     1      1        1      1      0 
V     0 to 1       0.75   0.75    0.75  0.75   0.75     0.75   0.75   0 

Table 3.4. 75% HSV Color Bars. 

Figure 3.13. Double Hexcone HSI Color Model. For consistency with the 
HSV model, we have changed from the Tektronix convention of blue at 0° 
and depict the model as a double hexcone rather than as a double cone. 

      Nominal 
      Range        White  Yellow  Cyan   Green  Magenta  Red    Blue   Black 

H     0° to 360°   -      60°     180°   120°   300°     0°     240°   - 
S     0 to 1       0      1       1      1      1        1      1      0 
I     0 to 1       0.75   0.375   0.375  0.375  0.375    0.375  0.375  0 

Table 3.5. 75% HSI Color Bars. For consistency with the HSV model, we have changed 
from the Tektronix convention of blue at 0°. 

In the CIE system, the intensities of red, 
green, and blue are transformed into what are 
called the tristimulus values, which are repre- 
sented by the capital letters X, Y, and Z. These 
values represent the relative quantities of the 
primary colors. 

The coordinate axes of Figure 3.14 are 
derived from the tristimulus values: 

x = X / (X + Y + Z) 
  = red / (red + green + blue) 

y = Y / (X + Y + Z) 
  = green / (red + green + blue) 

z = Z / (X + Y + Z) 
  = blue / (red + green + blue) 

Figure 3.14. CIE 1931 Chromaticity Diagram 
Showing Various Color Regions. 

The coordinates x, y, and z are called chro- 
maticity coordinates, and they always add up to 
1. As a result, z can always be expressed in 
terms of x and y, which means that only x and y 
are required to specify any color, and the dia- 
gram can be two-dimensional. 
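
A short sketch of this calculation follows. The 
function name is illustrative, and the D65 
tristimulus values used in the example are the 
commonly quoted ones (normalized so that Y = 1), 
introduced here only for demonstration. 

```python
def chromaticity(X, Y, Z):
    """Return the chromaticity coordinates (x, y, z) of tristimulus values."""
    total = X + Y + Z
    if total == 0:
        raise ValueError("chromaticity is undefined when X + Y + Z is zero")
    x = X / total
    y = Y / total
    z = 1.0 - x - y          # z is redundant: x + y + z is always 1
    return x, y, z


if __name__ == "__main__":
    # Commonly quoted D65 white point tristimulus values (assumed here).
    x, y, z = chromaticity(0.95047, 1.0, 1.08883)
    print(f"x = {x:.4f}, y = {y:.4f}, z = {z:.4f}")
```

The result is close to the (0.3127, 0.3290) reference 
white used by several of the video standards listed 
later in this section. 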

Typically, a source or display specifies 
three (x, y) coordinates to define the three 
primary colors it uses. The triangle formed by 
the three (x, y) coordinates encloses the gamut 
of colors that the source or display can 
reproduce. This is shown in Figure 3.15, which 
compares the color gamuts of NTSC, PAL, and 
HDTV. Note that no set of three colors can 
generate all possible colors, which is why 
television pictures are never completely 
accurate. 

Figure 3.15. CIE 1931 Chromaticity Diagram 
Showing Various Color Gamuts. 

In addition, a source or display usually 
specifies the (x, y) coordinate of the white color 
used, since pure white is not usually captured 
or reproduced. White is defined as the color 
captured or produced when all three primary 
signals are equal, and it has a subtle shade of 
color to it. Note that luminance, or brightness, 
information is not included in the standard CIE 
1931 chromaticity diagram, but is an axis that 
is orthogonal to the (x, y) plane. The lighter a 
color is, the more restricted the chromaticity 
range is. 

The RGB chromaticities and reference 
white (CIE illuminant C) for the 1953 NTSC 
standard are: 

R:      x_r = 0.67      y_r = 0.33 
G:      x_g = 0.21      y_g = 0.71 
B:      x_b = 0.14      y_b = 0.08 
white:  x_w = 0.3101    y_w = 0.3162 

Modern NTSC, 480i and 480p video systems 
use a different set of RGB chromaticities 
(SMPTE “C”) and reference white (CIE 
illuminant D65): 

R:      x_r = 0.630     y_r = 0.340 
G:      x_g = 0.310     y_g = 0.595 
B:      x_b = 0.155     y_b = 0.070 
white:  x_w = 0.3127    y_w = 0.3290 

The RGB chromaticities and reference 
white (CIE illuminant D65) for PAL, SECAM, 
576i and 576p video systems are: 

R:      x_r = 0.64      y_r = 0.33 
G:      x_g = 0.29      y_g = 0.60 
B:      x_b = 0.15      y_b = 0.06 
white:  x_w = 0.3127    y_w = 0.3290 



The RGB chromaticities and reference 
white (CIE illuminant D65) for sRGB, scRGB, 
xvYCC, and HDTV are based on BT.709 and 
SMPTE 274M: 

R:      x_r = 0.64      y_r = 0.33 
G:      x_g = 0.30      y_g = 0.60 
B:      x_b = 0.15      y_b = 0.06 
white:  x_w = 0.3127    y_w = 0.3290 
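
The chromaticities and reference white of a system 
also determine how much each primary contributes to 
luminance. The sketch below uses the standard 
colorimetric construction (solving for the primary 
scale factors that reproduce the white point, here 
with Cramer's rule); the function names are 
illustrative and this is not presented as the method 
used in this chapter. With the 1953 NTSC values it 
gives weights close to 0.299, 0.587, and 0.114, and 
with the BT.709 values close to 0.213, 0.715, and 
0.072. 

```python
def det3(m):
    """Determinant of a 3 x 3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def xyz_for_unit_luminance(x, y):
    """XYZ of a color with chromaticity (x, y) and luminance Y = 1."""
    return (x / y, 1.0, (1.0 - x - y) / y)


def luminance_weights(red, green, blue, white):
    """Luminance contribution of each primary for the given white point."""
    cols = [xyz_for_unit_luminance(*p) for p in (red, green, blue)]
    m = [[cols[c][r] for c in range(3)] for r in range(3)]
    w = xyz_for_unit_luminance(*white)
    d = det3(m)
    weights = []
    for c in range(3):                 # Cramer's rule, one column at a time
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][c] = w[r]
        weights.append(det3(mc) / d)
    return tuple(weights)


if __name__ == "__main__":
    ntsc_1953 = luminance_weights((0.67, 0.33), (0.21, 0.71), (0.14, 0.08),
                                  (0.3101, 0.3162))
    bt709 = luminance_weights((0.64, 0.33), (0.30, 0.60), (0.15, 0.06),
                              (0.3127, 0.3290))
    print("1953 NTSC:", [round(v, 4) for v in ntsc_1953])
    print("BT.709:   ", [round(v, 4) for v in bt709])
```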



Since different chromaticity and reference 
white values are used for various video stan- 
dards, minor color errors may occur when the 
source and display values do not match; for 
example, displaying a 480i or 480p program on 
an HDTV, or displaying an HDTV program on 
an NTSC television. These minor color errors 
can easily be corrected at the display by using 
a 3 x 3 matrix multiplier, as discussed in Chap- 
ter 7. 

The RGB chromaticities for consumer dis- 
plays are usually slightly different from the 
standards. As a result, one or more of the RGB 
colors are slightly off, such as having too much 
orange in the red, or too much blue in the 
green. This can usually be compensated by 
having the display professionally calibrated. 

Non-RGB Color Space 
Considerations 

When processing information in a non-RGB 
color space (such as YIQ, YUV, or YCbCr), care 
must be taken that combinations of values are 
not created that result in the generation of 
invalid RGB colors. The term invalid refers to 
RGB components outside the normalized RGB 
limits of (1, 1, 1). 

[Figure 3.16 legend: R = red, G = green, B = blue, 
Y = yellow, C = cyan, M = magenta, W = white, 
BK = black.] 

Figure 3.16. RGB Limits Transformed into 3-D YCbCr Space. 

For example, given that RGB has a normalized 
value of (1, 1, 1), the resulting YCbCr value 
is (235, 128, 128). If Cb and Cr are manipulated 
to generate a YCbCr value of (235, 64, 73), the 
corresponding RGB normalized value becomes 
(0.6, 1.29, 0.56); note that the green value 
exceeds the normalized value of 1. 

From this illustration it is obvious that 
there are many combinations of Y, Cb, and Cr 
that result in invalid RGB values; these YCbCr 
values must be processed so as to generate 
valid RGB values. Figure 3.16 shows the RGB 
normalized limits transformed into the YCbCr 
color space. 



Best results are obtained using a constant 
luma and constant hue approach — Y is not 
altered while Cb and Cr are limited to the max- 
imum valid values having the same hue as the 
invalid color prior to limiting. The constant hue 
principle corresponds to moving invalid CbCr 
combinations directly towards the CbCr origin 
(128, 128), until they lie on the surface of the 
valid YCbCr color block. 
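
One possible sketch of the constant luma, constant 
hue approach follows. It assumes the SDTV digital 
YCbCr to R'G'B' equations given earlier (16-235 
Studio R'G'B'), a Y value already within 16-235, and 
a hypothetical function name. Because each R'G'B' 
component is linear in the chroma offsets, the 
largest valid scale factor can be found directly 
rather than by iteration. 

```python
def limit_cbcr(y, cb, cr, lo=16.0, hi=235.0):
    """Pull (Cb, Cr) toward (128, 128) until R'G'B' is valid; Y is unchanged."""
    dcb, dcr = cb - 128.0, cr - 128.0
    # Chroma contribution to R', G', and B' (SDTV digital equations).
    deltas = (1.371 * dcr,
              -0.698 * dcr - 0.336 * dcb,
              1.732 * dcb)
    k = 1.0                                  # 1.0 means no limiting is needed
    for d in deltas:
        if d > 0:
            k = min(k, (hi - y) / d)         # component would exceed white
        elif d < 0:
            k = min(k, (lo - y) / d)         # component would fall below black
    k = max(k, 0.0)
    # Scaling Cb and Cr by the same factor preserves the hue angle.
    return y, 128.0 + k * dcb, 128.0 + k * dcr


if __name__ == "__main__":
    print(limit_cbcr(235, 64, 73))    # peak luma leaves no room for chroma
    print(limit_cbcr(180, 64, 73))    # chroma is reduced until G' reaches 235
    print(limit_cbcr(100, 120, 140))  # already valid, returned unchanged
```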

When converting to the RGB color space 
from a non-RGB color space, care must be 
taken to include saturation logic to ensure 
overflow and underflow wrap-around condi- 
tions do not occur due to the finite precision of 
digital circuitry. 8-bit RGB values less than 0 
must be set to 0, and values greater than 255 
must be set to 255. 

Gamma Correction 

The transfer function of most CRT displays 
produces an intensity that is proportional to 
some power (referred to as gamma) of the sig- 
nal amplitude. As a result, high-intensity 
ranges are expanded and low-intensity ranges 
are compressed (see Figure 3.17). This is an 
advantage in combatting noise, as the eye is 
approximately equally sensitive to equally rela- 
tive intensity changes. By “gamma correcting” 
the video signals before transmission, the 
intensity output of the display is roughly linear 
(the gray line in Figure 3.17), and transmis- 
sion-induced noise is reduced. 

To minimize noise in the darker areas of 
the image, modern video systems limit the 
gain of the curve in the black region. This 
technique limits the gain close to black and 
stretches the remainder of the curve to main- 
tain function and tangent continuity. 

Although video standards assume a dis- 
play gamma of about 2.2, a gamma of about 2.5 
is more realistic for CRT displays. However, 
this difference improves the viewing in a dimly 
lit environment. More accurate viewing in a 
brightly lit environment may be accomplished 
by applying another gamma factor of about 
1.14 (2.5/2.2). It is also common to tweak the 
gamma curve in the display to get closer to the 
“film look.” 

Early NTSC Systems 

Early NTSC systems assumed a simple 
transform at the display, with a gamma of 2.2. 
RGB values are normalized to have a range of 0 
to 1: 

R = R'^{2.2} 
G = G'^{2.2} 
B = B'^{2.2} 



Figure 3.17. Effect of Gamma. 

To compensate for the nonlinear display, 
linear RGB data was “gamma-corrected” prior 
to transmission by the inverse transform. RGB 
values are normalized to have a range of 0 to 1: 

R' = R^{1/2.2} 
G' = G^{1/2.2} 
B' = B^{1/2.2} 

Early PAL and SECAM Systems 

Most early PAL and SECAM systems 
assumed a simple transform at the display, 
with a gamma of 2.8. RGB values are normal- 
ized to have a range of 0 to 1: 

R = R'^{2.8} 
G = G'^{2.8} 
B = B'^{2.8} 

To compensate for the nonlinear display, 
linear RGB data was “gamma-corrected” prior 
to transmission by the inverse transform. RGB 
values are normalized to have a range of 0 to 1: 

R' = R^{1/2.8} 
G' = G^{1/2.8} 
B' = B^{1/2.8} 

Current Systems 

Current NTSC, 480i, 480p, and HDTV 
video systems assume the following transform 
at the display, with a gamma of [1/0.45]. RGB 
values are normalized to have a range of 0 to 1: 

if (R', G', B') < 0.081 
R = R' / 4.5 
G = G' / 4.5 
B = B' / 4.5 

if (R', G', B') ≥ 0.081 
R = ((R' + 0.099) / 1.099)^{1/0.45} 
G = ((G' + 0.099) / 1.099)^{1/0.45} 
B = ((B' + 0.099) / 1.099)^{1/0.45} 



Extended gamut color spaces, such as 
scRGB, do additional processing for below- 
zero values: 

if (R', G', B') < -0.081 
R = -((R' - 0.099) / -1.099)^{1/0.45} 
G = -((G' - 0.099) / -1.099)^{1/0.45} 
B = -((B' - 0.099) / -1.099)^{1/0.45} 

if -0.081 < (R', G', B') < 0.081 
R = R' / 4.5 
G = G' / 4.5 
B = B' / 4.5 

To compensate for the nonlinear display, 
linear RGB data is “gamma-corrected” prior to 
transmission by the inverse transform. RGB 
values are normalized to have a range of 0 to 1: 

if (R, G, B) < 0.018 
R' = 4.5R 
G' = 4.5G 
B' = 4.5B 

if (R, G, B) ≥ 0.018 
R' = 1.099R^{0.45} - 0.099 
G' = 1.099G^{0.45} - 0.099 
B' = 1.099B^{0.45} - 0.099 
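
The two piecewise transforms for values in the 0 to 1 range can be checked against each other with a short Python sketch. The function names gamma_correct and gamma_expand are hypothetical, and the code handles a single normalized component at a time; it is an illustration of the equations, not an implementation taken from any standard.

```python
def gamma_correct(linear):
    """Linear value (0..1) -> gamma-corrected value, applied before transmission."""
    if linear < 0.018:
        return 4.5 * linear
    return 1.099 * linear ** 0.45 - 0.099


def gamma_expand(corrected):
    """Gamma-corrected value (0..1) -> linear value, the display-side transform."""
    if corrected < 0.081:
        return corrected / 4.5
    return ((corrected + 0.099) / 1.099) ** (1 / 0.45)


# Round trip a few sample values through both transforms.
for linear in (0.0, 0.01, 0.18, 0.5, 1.0):
    corrected = gamma_correct(linear)
    print(f"{linear:.3f} -> {corrected:.4f} -> {gamma_expand(corrected):.4f}")
```

The breakpoints are consistent with each other: 4.5 x 0.018 = 0.081, so a value that takes the linear segment in one direction takes the linear segment in the other.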

Extended gamut color spaces, such as 
scRGB, do additional processing for below- 
zero values: 

if (R, G, B) < -0.018 
R' = -1.099(-R)^{0.45} + 0.099 
G' = -1.099(-G)^{0.45} + 0.099 
B' = -1.099(-B)^{0.45} + 0.099 

if -0.018 < (R, G, B) < 0.018 
R' = 4.5R 
G' = 4.5G 
B' = 4.5B 

Although most PAL and SECAM standards 
specify a gamma of 2.8, a value of [1/0.45] is 
now commonly used. Thus, these equations 
are also now used for PAL, SECAM, 576i, and 
576p video systems. 

Non-CRT Displays 

Since they are not based on CRTs, the 
LCD, LCOS, DLP, and plasma displays have 
different display transforms. To simplify inter- 
facing to these displays, their electronics are 
designed to accept standard gamma-corrected 
video and then compensate for the actual trans- 
form of the display panel. 



Constant Luminance Problem 

Because the gamma and matrix operations are 
applied in the wrong order, the U and V (or Cb and Cr) 
signals also contribute to the luminance (Y) 
signal. This causes an error in the perceived 
luminance when the amplitude of U and V is 
not correct. This may be due to bandwidth-limiting 
U and V or a non-nominal setting of the U 
and V gain (color saturation). 

For low color frequencies, there is no problem. 
For high color frequencies, U and V disappear 
and consequently R', G', and B' degrade 
to be equal to (only) Y. 

Chapter 4 



Video Signals 
Overview 



Video signals come in a wide variety of 
options — number of scan lines, interlaced vs. 
progressive, analog vs. digital, and so on. This 
chapter provides an overview of the common 
video signal formats and their timing. 

Digital Component Video 
Background 

In digital component video, the video sig- 
nals are in digital form (YCbCr or RGB), 
being encoded to composite NTSC, PAL, or 
SECAM only when it is necessary for broad- 
casting or recording purposes. 

The European Broadcasting Union (EBU) 
became interested in a standard for digital 
component video due to the difficulties of 
exchanging video material between the 576i 
PAL and SECAM systems. The format held the 
promise that the digital video signals would be 
identical whether sourced in a PAL or SECAM 
country, allowing subsequent encoding to the 
appropriate composite form for broadcasting. 
Consultations with the Society of Motion Picture 
and Television Engineers (SMPTE) 
resulted in the development of an approach to 
support international program exchange, 
including 480i systems. 

A series of demonstrations was carried out 
to determine the quality and suitability for sig- 
nal processing of various methods. From these 
investigations, the main parameters of the digi- 
tal component coding, filtering, and timing 
were chosen and incorporated into ITU-R 
BT.601. BT.601 has since served as the starting 
point for other digital component video stan- 
dards. 

Coding Ranges 

The selection of the coding ranges bal- 
anced the requirements of adequate capacity 
for signals beyond the normal range and mini- 
mizing quantizing distortion. Although the 
black level of a video signal is reasonably well 
defined, the white level can be subject to varia- 
tions due to video signal and equipment toler- 
ances. Noise, gain variations, and transients 
produced by filtering can produce signal levels 
outside the nominal ranges. 

Either 8 or 10 bits per sample are used for each of 
the YCbCr or R'G'B' components. Although 8-bit 
coding introduces some quantizing distortion, 
it was originally felt that most video 
sources contained sufficient noise to mask 
most of the quantizing distortion. However, if 
the video source is virtually noise-free, the 
quantizing distortion is noticeable as contouring 
in areas where the signal brightness gradually 
changes. In addition, at least two additional 
bits of fractional YCbCr or R'G'B' data were 
desirable to reduce rounding effects when 
transmitting between equipment in the studio 
editing environment. For these reasons, most 
pro-video equipment uses 10-bit YCbCr or 
R'G'B', allowing 2 bits of fractional YCbCr or 
R'G'B' data to be maintained. 

Initial proposals had equal coding ranges 
for all three YCbCr components. However, this 
was changed so that Y had a greater margin for 
overloads at the white levels, as white level lim- 
iting is more visible than black. Thus, the nom- 
inal 8-bit Y levels are 16-235, while the nominal 
8-bit CbCr levels are 16-240 (with 128 corre- 
sponding to no color). Occasional excursions 
into the other levels are permissible, but never 
at the 0 and 255 levels. 

For 8-bit systems, the values of 0x00 and 
0xFF are reserved for timing information. For 
10-bit systems, the values of 0x000-0x003 and 
0x3FC-0x3FF are reserved for timing information, 
to maintain compatibility with 8-bit systems. 
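
The compatibility between the 8-bit and 10-bit reserved ranges can be illustrated with a small Python sketch. It assumes that a 10-bit code is the 8-bit code with two fractional bits appended, as described above; the function name is_reserved is hypothetical.

```python
def is_reserved(code, bits=8):
    """Return True if a sample code is reserved for timing information."""
    if bits == 8:
        return code in (0x00, 0xFF)
    if bits == 10:
        return code <= 0x003 or code >= 0x3FC
    raise ValueError("only 8-bit and 10-bit samples are covered here")


# Dropping the two fractional LSBs of a 10-bit code gives its 8-bit
# equivalent, so the two definitions should agree for every 10-bit code.
mismatches = [c for c in range(1024) if is_reserved(c, 10) != is_reserved(c >> 2, 8)]
print("mismatches:", mismatches)
```

An empty mismatch list shows that truncating any 10-bit sample to 8 bits never turns video data into a reserved code, or a reserved code into video data.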

The YCbCr or R'G'B' levels to generate 
75% color bars are discussed in Chapter 3. Digital 
R'G'B' signals are defined to have the same 
nominal levels as Y to provide processing margin 
and simplify the digital matrix conversions 
between R'G'B' and YCbCr. 



SDTV Sample Rate Selection 

Line-locked sampling of analog R'G'B' or 
YUV video signals is used. This technique produces 
a static orthogonal sampling grid in 
which samples on the current scan line fall 
directly beneath those on previous scan lines 
and fields, as shown in Figures 3.2 through 3.11. 

Another important feature is that the sam- 
pling is locked in phase so that one sample is 
coincident with the 50% amplitude point of the 
falling edge of analog horizontal sync (0H). 
This ensures that different sources produce 
samples at nominally the same positions in the 
picture. Making this feature common simpli- 
fies conversion from one standard to another. 

For 480i and 576i video systems, several Y 
sampling frequencies were initially examined, 
including four times Fsc. However, the four- 
times Fsc sampling rates did not support the 
requirement of simplifying international 
exchange of programs, so they were dropped 
in favor of a single common sampling rate. 
Because the lowest sample rate possible (while 
still supporting quality video) was a goal, a 12 
MHz sample rate was preferred for a long 
time, but eventually was considered to be too 
close to the Nyquist limit, complicating the fil- 
tering requirements. When the frequencies 
between 12 MHz and 14.3 MHz were exam- 
ined, it became evident that a 13.5 MHz sample 
rate for Y provided some commonality 
between 480i and 576i systems. Cb and Cr, 
being color difference signals, do not require 
the same bandwidth as the Y, so may be sam- 
pled at one-half the Y sample rate, or 6.75 
MHz. 

The “4:2:2” notation now commonly used 
originally applied to NTSC and PAL video, 
implying that Y, U, and V were sampled at 4x, 
2x, and 2x the color subcarrier frequency, 
respectively. The “4:2:2” notation was then 
adapted to BT.601 digital component video, 
implying that the sampling frequencies of Y, 
Cb and Cr were 4x, 2x, and 2x 3.375 MHz, 
respectively. “4:2:2” now commonly means that 
the sample rate of Cb and Cr is one-half that of 
Y, regardless of the actual sample rates used. 

With 13.5 MHz sampling, each scan line 
contains 858 samples (480i systems) or 864 
samples (576i systems) and consists of a digital 
blanking interval followed by an active line 
period. Both the 480i and 576i systems use 720 
samples during the active line period. Having a 
common number of samples for the active line 
period simplifies the design of multistandard 
equipment and standards conversion. With a 
sample rate of 6.75 MHz for Cb and Cr (4:2:2 
sampling) , each active line period contains 360 
Cr samples and 360 Cb samples. 
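
The sample counts quoted above follow from dividing the sample rate by the line rate. The Python sketch below works through that arithmetic; it assumes 525 total lines at 30/1.001 Hz for 480i and 625 total lines at 25 Hz for 576i, and the function name samples_per_line is hypothetical.

```python
from fractions import Fraction

SAMPLE_RATE_HZ = 13_500_000  # BT.601 Y sample rate


def samples_per_line(total_lines, frame_rate_hz, sample_rate_hz=SAMPLE_RATE_HZ):
    """Total Y samples per scan line = sample rate / line rate."""
    line_rate_hz = total_lines * Fraction(frame_rate_hz)
    return Fraction(sample_rate_hz) / line_rate_hz


total_480i = samples_per_line(525, Fraction(30000, 1001))  # 29.97 Hz frame rate
total_576i = samples_per_line(625, 25)
print(total_480i, total_576i)              # total samples per line
print(total_480i - 720, total_576i - 720)  # samples left for digital blanking
print(720 // 2)                            # Cb (or Cr) samples per active line in 4:2:2
```

Exact fractions are used so that the 1.001 factor cancels cleanly; both systems give a whole number of samples per line, which is the commonality that made 13.5 MHz attractive.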

With analog systems, problems may arise 
with repeated processing, causing an exten- 
sion of the blanking intervals and softening of 
the blanking edges. Using 720 digital samples 
for the active line period accommodates the 
range of analog blanking tolerances of both the 
480i and 576i systems. Therefore, repeated 
processing may be done without affecting the 
digital blanking interval. Blanking to define the 
analog picture width need only be done once, 
preferably at the display or upon conversion to 
analog video. 

Initially, BT.601 supported only 480i and 
576i systems with a 4:3 aspect ratio (720 x 480i 
and 720 x 576i active resolutions). Support for 
a 16:9 aspect ratio was then added (960 x 480i 
and 960 x 576i active resolutions) using an 18 
MHz sample rate. 



EDTV Sample Rate Selection 

ITU BT.1358 defines the progressive 
SDTV video signals, also known as 480p or 
576p, or Enhanced Digital Television (EDTV). 
The sample rate is doubled to 27 MHz (4:3 
aspect ratio) or 36 MHz (16:9 aspect ratio) in 
order to keep the same static orthogonal sam- 
pling grid as that used by BT.601. 

HDTV Sample Rate Selection 

ITU BT.709 defines the 720p, 1080i, and 
1080p video signals. With HDTV, 
a different technique was used — the number of 
active samples per line and the number of 
active lines per frame is constant regardless of 
the frame rate. Thus, in order to keep a static 
orthogonal sampling grid, each frame rate 
uses a different sample clock rate. 



480i and 480p Systems 

Interlaced Analog Composite Video 

(M) NTSC and (M) PAL are analog com- 
posite video signals that carry all timing and 
color information within a single signal. These 
analog interfaces use 525 lines per frame and 
are discussed in detail in Chapter 8. 

Interlaced Analog Component Video 

Analog component signals are comprised 
of three signals, analog R'G'B' or YPbPr. 
Referred to as 480i (since there are typically 
480 active scan lines per frame and they are 
interlaced), the frame rate is usually 29.97 Hz 
(30/1.001) for compatibility with (M) NTSC 
timing. The analog interface uses 525 lines per 
frame, with active video present on lines 23- 
262 and 286-525, as shown in Figure 4.1. 

Figure 4.1. 480i Vertical Interval Timing. 

Figure 4.2. 480p Vertical Interval Timing. 

For the 29.97 Hz frame rate, each scan line 
time (H) is about 63.556 µs. Detailed horizontal 
timing is dependent on the specific video 
interface used, as discussed in Chapter 5. 

Progressive Analog Component Video 

Analog component signals are comprised 
of three signals, analog R'G'B' or YPbPr. 
Referred to as 480p (since there are typically 
480 active scan lines per frame and they are 
progressive) , the frame rate is usually 59.94 Hz 
(60/1.001) for easier compatibility with (M) 
NTSC timing. The analog interface uses 525 
lines per frame, with active video present on 
lines 45-524, as shown in Figure 4.2. 

For the 59.94 Hz frame rate, each scan line 
time (H) is about 31.778 µs. Detailed horizontal 
timing is dependent on the specific video 
interface used, as discussed in Chapter 5. 
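
The scan line time is the reciprocal of the line rate, that is, of the total lines per frame multiplied by the frame rate. A minimal Python sketch of that calculation for the two 525-line cases is shown here; the function name line_time_us is hypothetical.

```python
from fractions import Fraction


def line_time_us(total_lines, frame_rate_hz):
    """Scan line time H in microseconds = 1 / (total lines x frame rate)."""
    return float(1_000_000 / (total_lines * Fraction(frame_rate_hz)))


print(round(line_time_us(525, Fraction(30000, 1001)), 3))  # 480i at 30/1.001 Hz
print(round(line_time_us(525, Fraction(60000, 1001)), 3))  # 480p at 60/1.001 Hz
```

Doubling the frame rate while keeping 525 total lines halves the line time, which is why the progressive value is half the interlaced one.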



Interlaced Digital Component Video 

BT.601 and SMPTE 267M specify the representation 
for 480i digital R'G'B' or YCbCr 
video signals. Active resolutions defined within 
BT.601 and SMPTE 267M, their 1x Y and 
R'G'B' sample rates (Fs), and frame rates, are: 

    960 x 480i    18.0 MHz    29.97 Hz 
    720 x 480i    13.5 MHz    29.97 Hz 

Other common active resolutions, their 1x 
sample rates (Fs), and frame rates, are: 

    864 x 480i    16.38 MHz    29.97 Hz 
    704 x 480i    13.50 MHz    29.97 Hz 
    640 x 480i    12.27 MHz    29.97 Hz 
    544 x 480i    10.12 MHz    29.97 Hz 
    528 x 480i    9.900 MHz    29.97 Hz 
    480 x 480i    9.000 MHz    29.97 Hz 
    352 x 480i    6.750 MHz    29.97 Hz 



Figure 4.3. 480i Analog-Digital Relationship (4:3 Aspect Ratio, 29.97 Hz Frame Rate, 13.5 MHz 
Sample Clock). BT.601 specifies 16 samples for the front porch; CEA-861D (DVI and HDMI timing) 
specifies 19 samples for the front porch. 

Figure 4.4. 480i Analog-Digital Relationship (16:9 Aspect Ratio, 29.97 Hz Frame Rate, 18 MHz Sample Clock). 

Figure 4.5. 480i Analog-Digital Relationship (4:3 Aspect Ratio, 29.97 Hz Frame Rate, 12.27 MHz Sample Clock). 

Figure 4.6. 480i Analog-Digital Relationship (4:3 Aspect Ratio, 29.97 Hz Frame Rate, 10.125 MHz Sample Clock). 

Figure 4.7. 480i Analog-Digital Relationship (4:3 Aspect Ratio, 29.97 Hz Frame Rate, 9 MHz Sample Clock). 

Figure 4.8. 480i Digital Vertical Timing (480 Active Lines). F and V change state at the EAV 
sequence at the beginning of the digital line. Note that the digital line number changes state prior 
to the start of horizontal sync, as shown in Figures 4.3 through 4.7. 

These active lines are used by the SMPTE RP-202, ATSC A/54a, and ARIB STD-B32 standards. 
CEA-861D (DVI and HDMI timing) specifies lines 22-261 and 285-524 for active video. IEC 
61834-2, ITU-R BT.1618, and SMPTE 314M (DV formats) specify lines 23-262 and 285-524 for 
active video. 

ITU-R BT.656 specifies lines 20-263 and 283-525 for active video, resulting in 487 total active 
lines per frame. 






864 x 480i is a 16:9 square pixel format, 
while 640 x 480i is a 4:3 square pixel format. 
Although the ideal 16:9 resolution is 854 x 480i, 
864 x 480i supports the MPEG 16 x 16 block 
structure. The 704 x 480i format is produced by 
using the 720 x 480i format and blanking the 
first eight and last eight samples of each active 
scan line. Example relationships between the 
analog and digital signals are shown in Figures 
4.3 through 4.7. 
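
The square pixel widths mentioned above come from multiplying the number of active lines by the display aspect ratio, and the 864-sample figure from rounding up to a whole number of 16-sample blocks. A minimal Python sketch of that arithmetic follows; both function names are hypothetical.

```python
import math


def square_pixel_width(active_lines, aspect_w, aspect_h):
    """Active samples per line that give square pixels for a display aspect ratio."""
    return active_lines * aspect_w / aspect_h


def round_up_to_multiple(value, multiple=16):
    """Smallest multiple of `multiple` that is greater than or equal to value."""
    return math.ceil(value / multiple) * multiple


ideal_16_9 = square_pixel_width(480, 16, 9)
print(round(ideal_16_9, 2), round_up_to_multiple(ideal_16_9))  # 16:9 case
print(square_pixel_width(480, 4, 3))                           # 4:3 case
```

The 4:3 width is already a multiple of 16, so no padding is needed there; only the 16:9 width has to be widened to fit the block structure.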

The H (horizontal blanking), V (vertical 
blanking), and F (field) signals are defined in 
Figure 4.8. The H, V, and F timing indicated is 
compatible with video compression standards 
rather than BT.656 discussed in Chapter 6. 

Progressive Digital Component Video 

BT.1358 and SMPTE 293M specify the representation 
for 480p digital R'G'B' or YCbCr 
video signals. Active resolutions defined within 
BT.1358 and SMPTE 293M, their 1x sample 
rates (Fs), and frame rates, are: 

    960 x 480p    36.0 MHz    59.94 Hz 
    720 x 480p    27.0 MHz    59.94 Hz 

Other common active resolutions, their 1x 
Y and R'G'B' sample rates (Fs), and frame 
rates, are: 

    864 x 480p    32.75 MHz    59.94 Hz 
    704 x 480p    27.00 MHz    59.94 Hz 
    640 x 480p    24.54 MHz    59.94 Hz 
    544 x 480p    20.25 MHz    59.94 Hz 
    528 x 480p    19.80 MHz    59.94 Hz 
    480 x 480p    18.00 MHz    59.94 Hz 
    352 x 480p    13.50 MHz    59.94 Hz 


864 x 480p is a 16:9 square pixel format, 
while 640 x 480p is a 4:3 square pixel format. 
Although the ideal 16:9 resolution is 854 x 
480p, 864 x 480p supports the MPEG 16 x 16 
block structure. The 704 x 480p format is done 




Figure 4.9. 480p Analog-Digital Relationship (4:3 Aspect Ratio, 59.94 Hz Frame Rate, 27 MHz Sample Clock). 

Figure 4.10. 480p Analog-Digital Relationship (16:9 Aspect Ratio, 59.94 Hz Frame Rate, 36 MHz Sample Clock). 

Figure 4.11. 480p Analog-Digital Relationship (4:3 Aspect Ratio, 59.94 Hz Frame Rate, 24.54 MHz Sample Clock). 

Figure 4.12. 480p Analog-Digital Relationship (4:3 Aspect Ratio, 59.94 Hz Frame Rate, 20.25 MHz Sample Clock). 

Figure 4.13. 480p Digital Vertical Timing (480 Active Lines). V changes state at the EAV 
sequence at the beginning of the digital line. Note that the digital line number changes state prior 
to the start of horizontal sync, as shown in Figures 4.9 through 4.12. 

These active lines are used by the SMPTE RP-202, ATSC A/54, and ARIB STD-B32 standards. 
However, CEA-861 (DVI and HDMI timing) specifies lines 43-522 for active video. 

by using the 720 x 480p format, and blanking 
the first eight and last eight samples of each 
active scan line. Example relationships 
between the analog and digital signals are 
shown in Figures 4.9 through 4.12. 

The H (horizontal blanking), V (vertical 
blanking), and F (field) signals are defined in 
Figure 4.13. The H, V, and F timing indicated is 
compatible with video compression standards 
rather than BT.656 discussed in Chapter 6. 

SIF and QSIF 

SIF is defined to have an active resolution 
of 352 x 240p. Square pixel SIF is defined to 
have an active resolution of 320 x 240p. 

QSIF is defined to have an active resolu- 
tion of 176 x 120p. Square pixel QSIF is defined 
to have an active resolution of 160 x 120p. 



576i and 576p Systems 

Interlaced Analog Composite Video 

(B, D, G, H, I, N, Nc) PAL are analog composite 
video signals that carry all timing and 
color information within a single signal. These 
analog interfaces use 625 lines per frame and 
are discussed in detail in Chapter 8. 

Interlaced Analog Component Video 

Analog component signals are comprised 
of three signals, analog R'G'B' or YPbPr. 
Referred to as 576i (since there are typically 
576 active scan lines per frame and they are 
interlaced) , the frame rate is usually 25 Hz for 
compatibility with PAL timing. The analog 
interface uses 625 lines per frame, with active 
video present on lines 23-310 and 336-623, as 
shown in Figure 4.14. 



For the 25 Hz frame rate, each scan line 
time (H) is 64 µs. Detailed horizontal timing is 
dependent on the specific video interface used, 
as discussed in Chapter 5. 

Progressive Analog Component Video 

Analog component signals are comprised 
of three signals, analog R'G'B' or YPbPr. 
Referred to as 576p (since there are typically 
576 active scan lines per frame and they are 
progressive), the frame rate is usually 50 Hz 
for compatibility with PAL timing. The analog 
interface uses 625 lines per frame, with active 
video present on lines 45-620, as shown in Fig- 
ure 4.15. 

For the 50 Hz frame rate, each scan line 
time (H) is 32 µs. Detailed horizontal timing is 
dependent on the specific video interface used, 
as discussed in Chapter 5. 

Interlaced Digital Component Video 

BT.601 specifies the representation for 
576i digital R'G'B' or YCbCr video signals. 
Active resolutions defined within BT.601, their 
1x Y and R'G'B' sample rates (Fs), and frame 
rates, are: 

    960 x 576i    18.0 MHz    25 Hz 
    720 x 576i    13.5 MHz    25 Hz 

Other common active resolutions, their 1x 
Y and R'G'B' sample rates (Fs), and frame 
rates, are: 

    1024 x 576i    19.67 MHz    25 Hz 
     768 x 576i    14.75 MHz    25 Hz 
     704 x 576i    13.50 MHz    25 Hz 
     544 x 576i    10.12 MHz    25 Hz 
     480 x 576i    9.000 MHz    25 Hz 

Figure 4.14. 576i Vertical Interval Timing. 




Figure 4.15. 576p Vertical Interval Timing. 

Figure 4.16. 576i Analog-Digital Relationship (4:3 Aspect Ratio, 25 Hz Frame Rate, 13.5 MHz Sample Clock). 

Figure 4.17. 576i Analog-Digital Relationship (16:9 Aspect Ratio, 25 Hz Frame Rate, 18 MHz Sample Clock). 

Figure 4.18. 576i Analog-Digital Relationship (4:3 Aspect Ratio, 25 Hz Frame Rate, 14.75 MHz Sample Clock). 

Figure 4.19. 576i Analog-Digital Relationship (4:3 Aspect Ratio, 25 Hz Frame Rate, 10.125 MHz Sample Clock). 

    Line Number    F    V 
    1-22           0    1 
    23-310         0    0 
    311-312        0    1 
    313-335        1    1 
    336-623        1    0 
    624-625        1    1 

Figure 4.20. 576i Digital Vertical Timing (576 Active Lines). F and V change state at the EAV 
sequence at the beginning of the digital line. Note that the digital line number changes state prior 
to the start of horizontal sync, as shown in Figures 4.16 through 4.19. 

IEC 61834-2, ITU-R BT.1618, and SMPTE 314M (DV formats) specify lines 23-310 and 335-622 
for active video. 

1024 x 576i is a 16:9 square pixel format, 
while 768 x 576i is a 4:3 square pixel format. 
The 704 x 576i format is produced by using the 720 
x 576i format and blanking the first eight and 
last eight samples of each active scan line. Example 
relationships between the analog and digital 
signals are shown in Figures 4.16 through 
4.19. 

The H (horizontal blanking), V (vertical 
blanking), and F (field) signals are defined in 
Figure 4.20. The H, V, and F timing indicated is 
compatible with video compression standards 
rather than BT.656 discussed in Chapter 6. 

Progressive Digital Component Video 

BT.1358 specifies the representation for 
576p digital R'G'B' or YCbCr signals. Active 
resolutions defined within BT.1358, their 1x Y 
and R'G'B' sample rates (Fs), and frame rates, 
are: 

    960 x 576p    36.0 MHz    50 Hz 
    720 x 576p    27.0 MHz    50 Hz 

Other common active resolutions, their 1x 
Y and R'G'B' sample rates (Fs), and frame 
rates, are: 

    1024 x 576p    39.33 MHz    50 Hz 
     768 x 576p    29.50 MHz    50 Hz 
     704 x 576p    27.00 MHz    50 Hz 
     544 x 576p    20.25 MHz    50 Hz 
     480 x 576p    18.00 MHz    50 Hz 



1024 x 576p is a 16:9 square pixel format, 
while 768 x 576p is a 4:3 square pixel format. 
The 704 x 576p format is produced by using the 720 
x 576p format and blanking the first eight and 
last eight samples of each active scan line. Example 
relationships between the analog and digital 
tal signals are shown in Figures 4.21 through 
4.24. 




Figure 4.21. 576p Analog-Digital Relationship (4:3 Aspect Ratio, 50 Hz Frame Rate, 27 MHz Sample Clock). 

Figure 4.22. 576p Analog-Digital Relationship (16:9 Aspect Ratio, 50 Hz Frame Rate, 36 MHz Sample Clock). 

Figure 4.23. 576p Analog-Digital Relationship (4:3 Aspect Ratio, 50 Hz Frame Rate, 29.5 MHz Sample Clock). 

Figure 4.24. 576p Analog-Digital Relationship (4:3 Aspect Ratio, 50 Hz Frame Rate, 20.25 MHz Sample Clock). 

Figure 4.25. 576p Digital Vertical Timing (576 Active Lines). V changes state at the EAV 
sequence at the beginning of the digital line. Note that the digital line number changes state prior 
to the start of horizontal sync, as shown in Figures 4.21 through 4.24. 

The H (horizontal blanking), V (vertical 
blanking), and F (field) signals are defined in 
Figure 4.25. The H, V, and F timing indicated is 
compatible with video compression standards 
rather than BT.656 discussed in Chapter 6. 



720p Systems 

Progressive Analog Component Video 

Analog component signals are comprised 
of three signals, analog R'G'B' or YPbPr. 
Referred to as 720p (since there are typically 
720 active scan lines per frame and they are 
progressive) , the frame rate is usually 59.94 Hz 
(60/1.001) to simplify the generation of (M) 
NTSC video. The analog interface uses 750 
lines per frame, with active video present on 
lines 26-745, as shown in Figure 4.26. 

For the 59.94 Hz frame rate, each scan line 
time (H) is about 22.24 µs. Detailed horizontal 
timing is dependent on the specific video inter- 
face used, as discussed in Chapter 5. 



Progressive Digital Component Video 

SMPTE 296M specifies the representation 
for 720p digital R'G'B' or YCbCr signals. 
Active resolutions defined within SMPTE 
296M, their 1x Y and R'G'B' sample rates (Fs), 
and frame rates, are: 

    1280 x 720p    74.176 MHz    23.976 Hz 
    1280 x 720p    74.250 MHz    24.000 Hz 
    1280 x 720p    74.250 MHz    25.000 Hz 
    1280 x 720p    74.176 MHz    29.970 Hz 
    1280 x 720p    74.250 MHz    30.000 Hz 
    1280 x 720p    74.250 MHz    50.000 Hz 
    1280 x 720p    74.176 MHz    59.940 Hz 
    1280 x 720p    74.250 MHz    60.000 Hz 



Note that square pixels and a 16:9 aspect 
ratio are used. Example relationships between 
the analog and digital signals are shown in Fig- 
ures 4.27 and 4.28, and Table 4.1. The H (hori- 
zontal blanking), V (vertical blanking), and F 
(field) signals are as defined in Figure 4.29. 



Figure 4.26. 720p Vertical Interval Timing. 

[Figure: one 720p scan line at a sample rate of 74.176 or 74.25 MHz. Digital blanking: 370 samples (0-369). Digital active line: 1280 samples (370-1649). Total line: 1650 samples (0-1649).] 

Figure 4.27. 720p Analog-Digital Relationship (16:9 Aspect 
Ratio, 59.94 Hz Frame Rate, 74.176 MHz Sample Clock and 60 
Hz Frame Rate, 74.25 MHz Sample Clock). 

[Figure: total line = [A] samples.] 

Figure 4.28. General 720p Analog-Digital Relationship. 

    Active        Frame        1x Y           Total         Horizontal    C 
    Horizontal    Rate         Sample Rate    Horizontal    Blanking      Samples 
    Samples       (Hz)         (MHz)          Samples (A)   Samples (B) 

    1280          24/1.001     74.25/1.001    4125          2845          2585 
                  24           74.25          4125          2845          2585 
                  25 (1)       48             1536          256           21 
                  25 (1)       49.5           1584          304           25 
                  25           74.25          3960          2680          2420 
                  30/1.001     74.25/1.001    3300          2020          1760 
                  30           74.25          3300          2020          1760 
                  50           74.25          1980          700           440 
                  60/1.001     74.25/1.001    1650          370           110 
                  60           74.25          1650          370           110 

Note: 

1. Useful for CRT-based 50 Hz HDTVs based on a 31.250 kHz horizontal frequency. 
Sync pulses are -300 mV bi-level, rather than ±300 mV tri-level. 720p 
content scaled vertically to 1152i active scan lines; 1250i total scan lines 
instead of 750p. 

Table 4.1. Various 720p Analog-Digital Parameters for Figure 4.28. 
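
The total and blanking sample counts in Table 4.1 can be cross-checked from the sample rate, the frame rate, and the total number of lines per frame. The Python sketch below does this for a few rows; it assumes 750 total lines for the standard 720p rows and 1250 total lines for a note 1 row, and the function name total_samples is hypothetical.

```python
from fractions import Fraction

ACTIVE_SAMPLES = 1280


def total_samples(sample_rate_mhz, total_lines, frame_rate_hz):
    """Total samples per line = sample rate / (total lines x frame rate)."""
    return Fraction(sample_rate_mhz) * 1_000_000 / (total_lines * Fraction(frame_rate_hz))


# (frame rate in Hz, sample rate in MHz, total lines per frame).
# 74.25 MHz is written as 297/4 so that the arithmetic stays exact.
rows = [
    (60, Fraction(297, 4), 750),
    (Fraction(60000, 1001), Fraction(297, 4) / Fraction(1001, 1000), 750),
    (50, Fraction(297, 4), 750),
    (25, 48, 1250),  # note 1 row: 1250i total scan lines
]
for frame_rate, rate_mhz, lines in rows:
    total = total_samples(rate_mhz, lines, frame_rate)
    print(float(frame_rate), total, total - ACTIVE_SAMPLES)
```

Each printed pair should match columns (A) and (B) of the corresponding table row, with (B) simply being the total minus the 1280 active samples.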

Figure 4.29. 720p Digital Vertical Timing (720 Active Lines). V changes state at the EAV 
sequence at the beginning of the digital line. Note that the digital line number changes state prior 
to the start of horizontal sync, as shown in Figures 4.27 and 4.28. 

1080i and 1080p Systems 

Interlaced Analog Component Video 

Analog component signals are comprised 
of three signals, analog R'G'B' or YPbPr. 
Referred to as 1080i (since there are typically 
1080 active scan lines per frame and they are 
interlaced), the frame rate is usually 25 or 
29.97 Hz (30/1.001) to simplify the generation 
of (B, D, G, H, I) PAL or (M) NTSC video. The 
analog interface uses 1125 lines per frame, 
with active video present on lines 21-560 and 
584-1123, as shown in Figure 4.30. 

MPEG-2 and MPEG-4 systems use 1088 
lines, rather than 1080, in order to have a multi- 
ple of 32 scan lines per frame. In this case, an 
additional 4 lines per field after the active video 
are used. 
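
The 1088-line figure is the active line count rounded up to the next multiple required by the codec. A minimal Python sketch of that rounding for the interlaced case is given here; the function name padded_lines is hypothetical.

```python
import math


def padded_lines(active_lines, multiple):
    """Round the active line count up to the next multiple used by the codec."""
    return math.ceil(active_lines / multiple) * multiple


coded = padded_lines(1080, 32)         # interlaced case: multiple of 32 per frame
extra_per_field = (coded - 1080) // 2  # padding is split evenly between the two fields
print(coded, extra_per_field)
```

The same helper can be called with a different multiple for other coding structures; only the interlaced case described above is exercised here.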

For the 25 Hz frame rate, each scan line 
time is about 35.56 µs. For the 29.97 Hz frame 
rate, each scan line time is about 29.66 µs. 
Detailed horizontal timing is dependent on the 
specific video interface used, as discussed in 
Chapter 5. 

1152i Format 

The 1152i active (1250 total) line format is 
not a broadcast transmission format. However, 
it is being used as an analog interconnection 
standard from HD set-top boxes and DVD play- 
ers to 50 Hz CRT-based HDTVs. This enables 
50 Hz HDTVs to use a fixed 31.25 kHz horizon- 
tal frequency, reducing their cost. Other 
HDTV display technologies, such as DLP, 
LCD, and plasma, are capable of handling the 
native timing of 720p50 (750p50 with VBI) and 
1080i25 (1125i25 with VBI) analog signals. 



The set-top box or DVD player converts 
720p50 and 1080i25 content to the 1152i25 for- 
mat. 1280 x 720p50 content is scaled to 1280 x 
1152i25; 1920 x 1080i25 content is presented 
letterboxed in a 1920 x 1152i25 format. HDTVs 
will have a nominal vertical zoom mode for cor- 
recting the geometry of 1080i25, which can be 
recognized by the vertical synchronizing sig- 
nal. 

Progressive Analog Component Video 

Analog component signals are comprised 
of three signals, analog R'G'B' or YPbPr. 
Referred to as 1080p (since there are typically 
1080 active scan lines per frame and they are 
progressive), the frame rate is usually 50 or 
59.94 Hz (60/1.001) to simplify the generation 
of (B, D, G, H, I) PAL or (M) NTSC video. The 
analog interface uses 1125 lines per frame, 
with active video present on lines 42-1121, as 
shown in Figure 4.31. 

MPEG-2 and MPEG-4 systems use 1088 
lines, rather than 1080, in order to have a multi- 
ple of 16 scan lines per frame. In this case, an 
additional 8 lines per frame after the active 
video are used. 

For the 50 Hz frame rate, each scan line 
time is about 17.78 µs. For the 59.94 Hz frame 
rate, each scan line time is about 14.83 µs. 
Detailed horizontal timing is dependent on the 
specific video interface used, as discussed in 
Chapter 5. 

Figure 4.30. 1080i Vertical Interval Timing. 

Figure 4.31. 1080p Vertical Interval Timing. 

[Figure: one 1080i scan line at a sample rate of 74.25 or 74.176 MHz. 280 samples (0-279) precede the digital active line of 1920 samples (280-2199). Total line: 2200 samples (0-2199).] 

Figure 4.32. 1080i Analog-Digital Relationship (16:9 Aspect 
Ratio, 29.97 Hz Frame Rate, 74.176 MHz Sample Clock and 30 
Hz Frame Rate, 74.25 MHz Sample Clock). 

Figure 4.33. General 1080i Analog-Digital Relationship. 
Active       Frame      1x Y            Total        Horizontal   D 
Horizontal   Rate       Sample Rate     Horizontal   Blanking     Samples 
Samples (A)  (Hz)       (MHz)           Samples (B)  Samples (C) 

1920         25 (1)     72              2304         384          32 
             25 (1)     74.25           2376         456          38 
             25         74.25           2640         720          528 
             30/1.001   74.25/1.001     2200         280          88 
             30         74.25           2200         280          88 

1440         25 (1)     54              1728         288          24 
             25         55.6875         1980         540          396 
             30/1.001   55.6875/1.001   1650         210          66 
             30         55.6875         1650         210          66 

1280         25 (1)     48              1536         256          21 
             25         49.5            1760         480          352 
             30/1.001   49.5/1.001      1466.7       186.7        58.7 
             30         49.5            1466.7       186.7        58.7 

Notes: 

(1) Useful for CRT-based 50 Hz HDTVs based on a 31.250 kHz horizontal 
frequency. Sync pulses are -300 mV bi-level, rather than ±300 mV tri-level. 
1080i content is letterboxed in 1152i active scan lines; there are 1250i 
total scan lines instead of 1125i. 

Table 4.2. Various 1080i Analog-Digital Parameters for Figure 4.33. 
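
The rows of Table 4.2 are tied together by simple arithmetic: the sample rate is the total samples per line (B) times the total lines per frame times the frame rate, and the horizontal blanking (C) is B minus the active samples (A). The sketch below checks a few rows; the function names are illustrative, and the 1125 and 1250 total-line counts are taken from the surrounding text and the table note.

```python
def sample_rate_mhz(total_samples: float, total_lines: int, frame_rate_hz: float) -> float:
    """1x Y sample rate implied by the raster size and frame rate."""
    return total_samples * total_lines * frame_rate_hz / 1e6


def horizontal_blanking(total_samples: float, active_samples: int) -> float:
    """Column C of the table: total samples (B) minus active samples (A)."""
    return total_samples - active_samples


# 1920 x 1080i at 30 Hz: 2200 total samples, 1125 total lines.
rate_30 = sample_rate_mhz(2200, 1125, 30)
# 1920 x 1080i at 25 Hz: 2640 total samples, 1125 total lines.
rate_25 = sample_rate_mhz(2640, 1125, 25)
# Note 1 variant: 2304 total samples, 1250 total lines, 25 Hz.
rate_25_note1 = sample_rate_mhz(2304, 1250, 25)

print(rate_30, rate_25, rate_25_note1)
print(horizontal_blanking(2200, 1920), horizontal_blanking(2640, 1920))
```
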



Interlaced Digital Component Video 

ITU-R BT.709 and SMPTE 274M specify 
the digital component format for the 1080i digital 
R'G'B' or YCbCr signal. Active resolutions 
defined within BT.709 and SMPTE 274M, their 
1x Y and R'G'B' sample rates (Fs), and frame 
rates, are: 

1920 x 1080i 74.250 MHz 25.00 Hz 

1920 x 1080i 74.176 MHz 29.97 Hz 

1920 x 1080i 74.250 MHz 30.00 Hz 

Note that square pixels and a 16:9 aspect 
ratio are used. Other common active resolutions, 
their 1x Y and R'G'B' sample rates (Fs), 
and frame rates, are: 

1280 x 1080i  49.500 MHz  25.00 Hz 

1280 x 1080i  49.451 MHz  29.97 Hz 

1280 x 1080i  49.500 MHz  30.00 Hz 

1440 x 1080i  55.688 MHz  25.00 Hz 

1440 x 1080i  55.632 MHz  29.97 Hz 

1440 x 1080i  55.688 MHz  30.00 Hz 


Example relationships between the analog 
and digital signals are shown in Figures 4.32 
and 4.33, and Table 4.2. The H (horizontal 
blanking) and V (vertical blanking) signals are 
as defined in Figure 4.34. 
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Figure 4.34. 1080i Digital Vertical Timing (1080 Active Lines). F and V change state at the EAV 
sequence at the beginning of the digital line. Note that the digital line number changes state prior 
to the start of horizontal sync, as shown in Figures 4.32 and 4.33. 
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Progressive Digital Component Video 

ITU-R BT.709 and SMPTE 274M specify 
the digital component format for the 1080p digital 
R'G'B' or YCbCr signal. Active resolutions 
defined within BT.709 and SMPTE 274M, their 
1x Y and R'G'B' sample rates (Fs), and frame 
rates, are: 


1920 x 1080p  74.176 MHz  23.976 Hz 

1920 x 1080p  74.250 MHz  24.000 Hz 

1920 x 1080p  74.250 MHz  25.000 Hz 

1920 x 1080p  74.176 MHz  29.970 Hz 

1920 x 1080p  74.250 MHz  30.000 Hz 

1920 x 1080p  148.50 MHz  50.000 Hz 

1920 x 1080p  148.35 MHz  59.940 Hz 

1920 x 1080p  148.50 MHz  60.000 Hz 


Note that square pixels and a 16:9 aspect 
ratio are used. Other common active resolutions, 
their 1x Y and R'G'B' sample rates (Fs), 
and frame rates, are: 

1280 x 1080p  49.451 MHz  23.976 Hz 

1280 x 1080p  49.500 MHz  24.000 Hz 

1280 x 1080p  49.500 MHz  25.000 Hz 

1280 x 1080p  49.451 MHz  29.970 Hz 

1280 x 1080p  49.500 MHz  30.000 Hz 

1280 x 1080p  99.000 MHz  50.000 Hz 

1280 x 1080p  98.901 MHz  59.940 Hz 

1280 x 1080p  99.000 MHz  60.000 Hz 

1440 x 1080p  55.632 MHz  23.976 Hz 

1440 x 1080p  55.688 MHz  24.000 Hz 

1440 x 1080p  55.688 MHz  25.000 Hz 

1440 x 1080p  55.632 MHz  29.970 Hz 

1440 x 1080p  55.688 MHz  30.000 Hz 

1440 x 1080p  111.38 MHz  50.000 Hz 

1440 x 1080p  111.26 MHz  59.940 Hz 

1440 x 1080p  111.38 MHz  60.000 Hz 


Example relationships between the analog 
and digital signals are shown in Figures 4.35 
and 4.36, and Table 4.3. The H (horizontal 
blanking), V (vertical blanking), and F (field) 
signals are as defined in Figure 4.37. 



Other Video Systems 

Some consumer displays, such as those 
based on LCD and plasma technologies, have 
adopted other resolutions as their native 
resolution. Common active resolutions and their 
names are: 


640 x 400    VGA 

640 x 480    VGA 

854 x 480    WVGA 

800 x 600    SVGA 

1024 x 768   XGA 

1280 x 768   WXGA 

1366 x 768   WXGA 

1024 x 1024  XGA 

1280 x 1024  SXGA 

1600 x 1024  WSXGA 

1600 x 1200  UXGA 

1920 x 1200  WUXGA 


These resolutions, and their timings, are 
defined for computer monitors by the Video 
Electronics Standards Association (VESA). 
Displays based on one of these native resolu- 
tions are usually capable of accepting many 
input resolutions, scaling the source to match 
the display resolution. 
[Figure 4.35. 1080p Analog-Digital Relationship (16:9 Aspect 
Ratio, 59.94 Hz Frame Rate, 148.35 MHz Sample Clock and 60 
Hz Frame Rate, 148.5 MHz Sample Clock). Sample rate = 148.5 
or 148.35 MHz. Each line has 2200 total samples (0-2199): 
280 blanking samples (0-279) followed by 1920 active 
samples (280-2199).] 
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Figure 4.36. General 1080p Analog-Digital Relationship. 
Active       Frame      1x Y            Total        Horizontal   D 
Horizontal   Rate       Sample Rate     Horizontal   Blanking     Samples 
Samples (A)  (Hz)       (MHz)           Samples (B)  Samples (C) 

1920         24/1.001   74.25/1.001     2750         830          638 
             24         74.25           2750         830          638 
             25         74.25           2640         720          528 
             30/1.001   74.25/1.001     2200         280          88 
             30         74.25           2200         280          88 
             50         148.5           2640         720          528 
             60/1.001   148.5/1.001     2200         280          88 
             60         148.5           2200         280          88 

1440         24/1.001   55.6875/1.001   2062.5       622.5        478.5 
             24         55.6875         2062.5       622.5        478.5 
             25         55.6875         1980         540          396 
             30/1.001   55.6875/1.001   1650         210          66 
             30         55.6875         1650         210          66 
             50         111.375         1980         540          396 
             60/1.001   111.375/1.001   1650         210          66 
             60         111.375         1650         210          66 

1280         24/1.001   49.5/1.001      1833.3       553.3        425.3 
             24         49.5            1833.3       553.3        425.3 
             25         49.5            1760         480          352 
             30/1.001   49.5/1.001      1466.7       186.7        58.7 
             30         49.5            1466.7       186.7        58.7 
             50         99              1760         480          352 
             60/1.001   99/1.001        1466.7       186.7        58.7 
             60         99              1466.7       186.7        58.7 

Table 4.3. Various 1080p Analog-Digital Parameters for 
Figure 4.36. 
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Figure 4.37. 1080p Digital Vertical Timing (1080 Active Lines). V changes state at the EAV 
sequence at the beginning of the digital line. Note that the digital line number changes state prior 
to the start of horizontal sync, as shown in Figures 4.35 and 4.36. 
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Chapter 5 



Analog Video 
Interfaces 



For years, the primary video signal used 
by the consumer market has been composite 
NTSC or PAL video (see Figures 8.2 and 8.13). 
Attempts have been made to support S-video, 
but, until recently, it has been largely limited to 
S-VHS VCRs and high-end televisions. 

With the introduction of DVD players, digi- 
tal set-top boxes, and DTV, there has been 
renewed interest in providing high-quality 
video to the consumer market. This equipment 
not only supports very high-quality composite 
and S-video signals, but many devices also 
allow the option of using analog R'G'B' or 
YPbPr video. 

Using analog R'G'B' or YPbPr video eliminates 
NTSC/PAL encoding and decoding artifacts. 
As a result, the picture is sharper and has 
less noise. More color bandwidth is also avail- 
able, increasing the horizontal detail. 



S-Video Interface 

The RCA phono connector (consumer 
market) or BNC connector (pro-video market) 
transfers a composite NTSC or PAL video sig- 
nal, made by adding the intensity (Y) and color 
(C) video signals together. The television then 
has to separate these Y and C video signals in 
order to display the picture. The problem is 
that the Y/C separation process is never perfect, 
as discussed in Chapter 9. 

Many video components now support a 4-pin 
“S1” S-video connector, illustrated in Figure 
5.1 (the female connector viewpoint). This 
connector keeps the intensity (Y) and color 
(C) video signals separate, eliminating the Y/C 
separation process in the TV. As a result, the 
picture is sharper and has less noise. Figures 
9.2 and 9.3 illustrate the Y signal, and Figures 
9.10 and 9.11 illustrate the C signal. 

NTSC and PAL VBI (vertical blanking 
interval) data, discussed in Chapter 8, may be 
present on the 480i or 576i Y video signal. 
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The “S2” version adds a +5V DC offset to 
the C signal when a widescreen (16:9) anamor- 
phic program (horizontally squeezed by 25%) 
is present. A 16:9 TV detects the DC offset and 
horizontally expands the 4:3 image to fill the 
screen, restoring the correct aspect ratio of the 
program. The “S3” version also supports using 
a +2.3V offset when a program is letterboxed. 

The IEC 60933-5 standard specifies the S- 
video connector, including signal levels. 

Extended S-Video Interface 

The PC market also uses an extended S- 
Video interface. This interface has 7 pins, as 
shown in Figure 5.1, and is backwards compat- 
ible with the 4-pin interface. 

The use of the three additional pins varies 
by manufacturer. They may be used to support 
an I²C interface (SDA bi-directional data pin 
and SCL clock pin), +12V power, a composite 
NTSC/PAL video signal (CVBS), or analog 
R'G'B' or YPbPr video signals. 



SCART Interface 

Most consumer video components in 
Europe support one or two 21-pin SCART con- 
nectors (also known as Peritel, Peritelevision, 
and Euroconnector). This connection allows 
analog R'G'B' video or S-video, composite 
video, and analog stereo audio to be transmit- 
ted between equipment using a single cable. 
The composite video signal must always be 
present, as it provides the basic video timing 
for the analog R'G'B' video signals. Note that 
the 700 mV R'G'B' signals do not have a blank- 
ing pedestal or sync information, as illustrated 
in Figure 5.4. 

PAL VBI (vertical blanking interval) data, 
discussed in Chapter 8, may be present on the 
576i composite video signal. 

There are now several types of SCART 
pinouts, depending on the specific functions 
implemented, as shown in Tables 5.1 through 
5.3. Pinout details are shown in Figure 5.2. 

The CENELEC EN 50049-1 and IEC 
60933 standards specify the basic SCART con- 
nector, including signal levels. 



[Figure 5.1. S-Video Connector and Signal Names. 

7-pin mini DIN connector: 1, 2 = GND; 3 = Y; 4 = C; 
5 = CVBS / SCL (serial clock); 6 = GND / SDA (serial data); 
7 = NC / +12V. 

4-pin mini DIN connector: 1, 2 = GND; 3 = Y; 4 = C.] 

[Figure 5.2. SCART Connector. Odd-numbered pins 1-21 are in one 
row and even-numbered pins 2-20 are in the other.] 
Pin  Function                            Signal Level               Impedance 

1    right audio out                     0.5V rms                   < 1K ohm 
2    right audio in                      0.5V rms                   > 10K ohm 
3    left / mono audio out               0.5V rms                   < 1K ohm 
4    ground - for pins 1, 2, 3, 6 
5    ground - for pin 7 
6    left / mono audio in                0.5V rms                   > 10K ohm 
7    blue (or C) video in / out          0.7V (or 0.3V burst)       75 ohms 
8    status and aspect ratio in / out    9.5V-12V = 4:3 source      > 10K ohm 
                                         4.5V-7V = 16:9 source 
                                         0V-2V = inactive source 
9    ground - for pin 11 
10   data 2 
11   green video in / out                0.7V                       75 ohms 
12   data 1 
13   ground - for pin 15 
14   ground - for pin 16 
15   red (or C) video in / out           0.7V (or 0.3V burst)       75 ohms 
16   RGB control in / out                1-3V = RGB,                75 ohms 
                                         0-0.4V = composite 
17   ground - for pin 19 
18   ground - for pin 20 
19   composite (or Y) video out          1V                         75 ohms 
20   composite (or Y) video in           1V                         75 ohms 
21   ground - for pins 8, 10, 12, shield 

Note: 

Often, the SCART 1 connector supports composite video and RGB, the SCART 2 
connector supports composite video and S-Video, and the SCART 3 connector supports 
only composite video. SCART connections may also be used to add external decoders 
or descramblers to the video path: the video signal goes out and comes back in. 

The RGB control signal switches the TV between the composite and RGB 
inputs, enabling text to be overlaid onto the video, even the internal TV program. 
This enables an external teletext or closed captioning decoder to add information over 
the current program. If pin 16 is held high, signifying RGB signals are present, the 
sync is still carried on the Composite Video pin. Some devices (such as DVD players) 
may provide RGB on a SCART and hold pin 16 permanently high. 

When a source becomes active, it sets a 12V level on pin 8. This causes the TV to 
automatically switch to that SCART input. When the source stops, the signal returns to 
0V and TV viewing is resumed. If an anamorphic 16:9 program is present, the source 
raises the signal on pin 8 to only 6V. This causes the TV to switch to that SCART input 
and at the same time enable the video processing for anamorphic 16:9 programs. 

Table 5.1. SCART Connector Signals. 
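
The pin 8 and pin 16 signalling in Table 5.1 amounts to a pair of voltage-window comparisons. The sketch below shows how a receiver might classify the two control voltages using only the ranges listed in the table; the function names and the "undefined" result for voltages that fall between the listed windows are assumptions made for illustration.

```python
def classify_pin8(volts: float) -> str:
    """Status / aspect ratio signal (pin 8), per the Table 5.1 ranges."""
    if 9.5 <= volts <= 12.0:
        return "4:3 source"
    if 4.5 <= volts <= 7.0:
        return "16:9 source"
    if 0.0 <= volts <= 2.0:
        return "inactive source"
    return "undefined"  # assumption: the table does not cover these voltages


def classify_pin16(volts: float) -> str:
    """RGB control signal (pin 16), per the Table 5.1 ranges."""
    if 1.0 <= volts <= 3.0:
        return "RGB"
    if 0.0 <= volts <= 0.4:
        return "composite"
    return "undefined"  # assumption: the table does not cover these voltages


print(classify_pin8(12.0), "/", classify_pin8(6.0), "/", classify_pin8(0.0))
print(classify_pin16(2.0), "/", classify_pin16(0.1))
```
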
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SDTV RGB Interface 

Some SDTV consumer video equipment 
supports an analog R'G'B' video interface. 
NTSC and PAL VBI (vertical blanking interval) 
data, discussed in Chapter 8, may be 
present on 480i or 576i R'G'B' video signals. 
Three separate RCA phono connectors (consumer 
market) or BNC connectors (pro-video 
and PC market) are used. 

The horizontal and vertical video timing 
are dependent on the video standard, as discussed 
in Chapter 4. For sources, the video 
signal at the connector should have a source 
impedance of 75 Ω ±5%. For receivers, video 
inputs should be AC-coupled and have a 75 Ω 
±5% input impedance. The three signals must 
be coincident with respect to each other within 
±5 ns. 

Sync information may be present on just 
the green channel, all three channels, as a sep- 
arate composite sync signal, or as separate hor- 
izontal and vertical sync signals. A gamma of 
1/0.45 is used. 

7.5 IRE Blanking Pedestal 

As shown in Figure 5.3, the nominal active 
video amplitude is 714 mV, including a 7.5 ±2 
IRE blanking pedestal. A 286 ±6 mV composite 
sync signal may be present on just the green 
channel (consumer market), or all three channels 
(pro-video market). DC offsets up to ±1V 
may be present. 

Analog R'G'B' Generation 

Assuming 10-bit D/A converters (DACs) 
with an output range of 0-1.305V (to match the 
video DACs used by the NTSC/PAL encoder 
in Chapter 9), the 10-bit YCbCr to R'G'B' 
equations are: 

R' = 0.591(Y - 64) + 0.810(Cr - 512) 

G' = 0.591(Y - 64) - 0.413(Cr - 512) - 0.199(Cb - 512) 

B' = 0.591(Y - 64) + 1.025(Cb - 512) 

R'G'B' has a nominal 10-bit range of 0-518 
to match the active video levels used by the 
NTSC/PAL encoder in Chapter 9. Note that 
negative values of R'G'B' should be supported 
at this point. 

To implement the 7.5 IRE blanking pedestal, 
a value of 42 is added to the digital R'G'B' 
data during active video. 0 is added during the 
blanking time. 
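
As a rough illustration of the equations and pedestal step above, the sketch below converts one 10-bit YCbCr sample to R'G'B' and then to DAC codes during active video. The function names are hypothetical. Adding the blank level of 240 from Table 5.2 to reach the final DAC code, and clamping to the 0-1023 DAC range, are assumptions inferred from the level table rather than steps stated explicitly in the text.

```python
BLANK_LEVEL = 240   # 10-bit blank level for the 7.5 IRE case (Table 5.2)
PEDESTAL = 42       # value added during active video for the 7.5 IRE pedestal


def ycbcr_to_rgb_7p5_ire(y: int, cb: int, cr: int) -> tuple:
    """10-bit YCbCr to R'G'B' (nominal range 0-518, may go negative)."""
    r = 0.591 * (y - 64) + 0.810 * (cr - 512)
    g = 0.591 * (y - 64) - 0.413 * (cr - 512) - 0.199 * (cb - 512)
    b = 0.591 * (y - 64) + 1.025 * (cb - 512)
    return r, g, b


def active_video_dac_codes(y: int, cb: int, cr: int) -> tuple:
    """Add the pedestal and blank offset, then clamp to the 10-bit DAC range."""
    codes = []
    for value in ycbcr_to_rgb_7p5_ire(y, cb, cr):
        code = round(value) + PEDESTAL + BLANK_LEVEL
        codes.append(max(0, min(1023, code)))
    return tuple(codes)


print(active_video_dac_codes(940, 512, 512))  # white
print(active_video_dac_codes(64, 512, 512))   # black
```

With these assumptions, white and black land on the 800 and 282 levels listed in Table 5.2.
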

After the blanking pedestal is added, the 
R'G'B' data is clamped by a blanking signal 
that has a raised cosine distribution to slow the 
slew rate of the start and end of the video signal. 
For 480i and 576i systems, blank rise and 
fall times are 140 ±20 ns. For 480p and 576p 
systems, blank rise and fall times are 70 ±10 
ns. 

Composite sync information may be added 
to the R'G'B' data after the blank processing 
has been performed. Values of 16 (sync 
present) or 240 (no sync) are assigned. The 
sync rise and fall times should be processed to 
generate a raised cosine distribution (between 
16 and 240) to slow the slew rate of the sync 
signal. For 480i and 576i systems, sync rise and 
fall times are 140 ±20 ns, and horizontal sync 
width at the 50% point is 4.7 ±0.1 µs. For 480p 
and 576p systems, sync rise and fall times are 
70 ±10 ns, and horizontal sync width at the 50% 
point is 2.33 ±0.05 µs. 
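
A raised cosine edge between the blank and sync levels can be generated as in the sketch below. The function name is hypothetical, and the number of samples per edge is an assumption: in a real design it would be chosen from the sample clock so that the resulting rise or fall time meets the tolerances quoted above.

```python
import math


def raised_cosine_edge(start: int, end: int, n_samples: int) -> list:
    """Return n_samples codes moving from start to end along a raised cosine."""
    if n_samples < 2:
        raise ValueError("an edge needs at least two samples")
    edge = []
    for i in range(n_samples):
        # Weight goes smoothly from 0 to 1 with zero slope at both ends.
        weight = 0.5 * (1.0 - math.cos(math.pi * i / (n_samples - 1)))
        edge.append(round(start + (end - start) * weight))
    return edge


# Leading edge of sync for the 7.5 IRE case: blank level 240 down to sync level 16.
edge = raised_cosine_edge(240, 16, 5)
print(edge)
```
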

At this point, we have digital R'G'B' with 
sync and blanking information, as shown in 
Figure 5.3 and Table 5.2. The numbers in 
parentheses in Figure 5.3 indicate the data 
value for a 10-bit DAC with a full-scale output 
value of 1.305V. The digital R'G'B' data drives 
three 10-bit DACs to generate the analog 
R'G'B' video signals. 

[Figure 5.3. SDTV Analog RGB Levels. 7.5 IRE blanking level. 
Green, blue, or red channel: white level 1.020 V (800), black 
level 0.357 V (282), blank level 0.306 V (240). When sync is 
present, sync level 0.020 V (16).] 

[Figure 5.4. SDTV Analog RGB Levels. 0 IRE blanking level. 
Green, blue, or red channel: white level 1.020 V (800), 
black / blank level 0.321 V (252). When sync is present, sync 
level 0.020 V (16).] 

As the sample-and-hold action of the DAC 
introduces a (sin x)/x characteristic, the video 
data may be digitally filtered by a [(sin x)/x]^-1 
filter to compensate. Alternatively, as an analog 
lowpass filter is usually present after each 
DAC, the correction may take place in the analog 
filter. 
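
To give a feel for the size of the (sin x)/x droop, the sketch below evaluates the zero-order-hold response, with x = pi * f / fs, and the inverse gain a compensation filter would need at a given frequency. The function names and the 13.5 MHz sample rate used in the example are assumptions for illustration only.

```python
import math


def dac_sinc_gain(f_hz: float, fs_hz: float) -> float:
    """Magnitude of the DAC sample-and-hold response at frequency f_hz."""
    x = math.pi * f_hz / fs_hz
    return 1.0 if x == 0 else math.sin(x) / x


def compensation_gain(f_hz: float, fs_hz: float) -> float:
    """Gain an inverse (sin x)/x filter must provide at frequency f_hz."""
    return 1.0 / dac_sinc_gain(f_hz, fs_hz)


FS = 13.5e6  # assumed sample rate for this example
for f in (1e6, 3e6, 5e6):
    droop_db = 20 * math.log10(dac_sinc_gain(f, FS))
    print(f"{f / 1e6:.0f} MHz: droop {droop_db:.2f} dB, "
          f"compensation x{compensation_gain(f, FS):.3f}")
```
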



Video    7.5 IRE              0 IRE 
Level    Blanking Pedestal    Blanking Pedestal 

white    800                  800 
black    282                  252 
blank    240                  252 
sync     16                   16 

Table 5.2. SDTV 10-Bit R'G'B' Values. 



Analog R'G'B' Digitization 

Assuming 10-bit A/D converters (ADCs) 
with an input range of 0-1.305V (to match the 
video ADCs used by the NTSC/PAL decoder 
in Chapter 9), the 10-bit R'G'B' to YCbCr 
equations are: 

Y = 0.506(R' - 282) + 0.992(G' - 282) + 0.193(B' - 282) + 64 

Cb = -0.291(R' - 282) - 0.573(G' - 282) + 0.864(B' - 282) + 512 

Cr = 0.864(R' - 282) - 0.724(G' - 282) - 0.140(B' - 282) + 512 

R'G'B' has a nominal 10-bit range of 282- 
800 to match the active video levels used by 
the NTSC/PAL decoder in Chapter 9. Table 
5.2 and Figure 5.3 illustrate the 10-bit R'G'B' 
values for the white, black, blank, and 
(optional) sync levels. 
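
The digitization equations for the 7.5 IRE case can be exercised with a few known levels. This is a minimal sketch with a hypothetical function name; rounding to the nearest integer is an assumption, since the text does not say how results are quantized.

```python
def rgb_to_ycbcr_7p5_ire(r: int, g: int, b: int) -> tuple:
    """10-bit ADC R'G'B' codes (black = 282, white = 800) to 10-bit YCbCr."""
    r0, g0, b0 = r - 282, g - 282, b - 282
    y = 0.506 * r0 + 0.992 * g0 + 0.193 * b0 + 64
    cb = -0.291 * r0 - 0.573 * g0 + 0.864 * b0 + 512
    cr = 0.864 * r0 - 0.724 * g0 - 0.140 * b0 + 512
    return round(y), round(cb), round(cr)


print(rgb_to_ycbcr_7p5_ire(800, 800, 800))  # white
print(rgb_to_ycbcr_7p5_ire(282, 282, 282))  # black
print(rgb_to_ycbcr_7p5_ire(800, 282, 282))  # 100% red
```
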



0 IRE Blanking Pedestal 

As shown in Figure 5.4, the nominal active 
video amplitude is 700 mV, with no blanking 
pedestal. A 300 ±6 mV composite sync signal 
may be present on just the green channel (consumer 
market), or all three channels (pro-video 
market). DC offsets up to ±1V may be 
present. 

Analog R'G'B' Generation 

Assuming 10-bit DACs with an output 
range of 0-1.305V (to match the video DACs 
used by the NTSC/PAL encoder in Chapter 9), 
the 10-bit YCbCr to R'G'B' equations are: 

R' = 0.625(Y - 64) + 0.857(Cr - 512) 

G' = 0.625(Y - 64) - 0.437(Cr - 512) - 0.210(Cb - 512) 

B' = 0.625(Y - 64) + 1.084(Cb - 512) 

R'G'B' has a nominal 10-bit range of 0-548 
to match the active video levels used by the 
NTSC/PAL encoder in Chapter 9. Note that 
negative values of R'G'B' should be supported 
at this point. 

The R'G'B' data is processed as discussed 
when using a 7.5 IRE blanking pedestal. However, 
no blanking pedestal is added during 
active video, and the sync values are 16-252 
instead of 16-240. 

At this point, we have digital R'G'B' with 
sync and blanking information, as shown in 
Figure 5.4 and Table 5.2. The numbers in 
parentheses in Figure 5.4 indicate the data 
value for a 10-bit DAC with a full-scale output 
value of 1.305V. The digital R'G'B' data drives 
three 10-bit DACs to generate the analog 
R'G'B' video signals. 
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Analog R'G'B' Digitization 

Assuming 10-bit ADCs with an input range 
of 0-1.305V (to match the video ADCs used by 
the NTSC/PAL decoder in Chapter 9), the 10-bit 
R'G'B' to YCbCr equations are: 

Y = 0.478(R' - 252) + 0.938(G' - 252) + 0.182(B' - 252) + 64 

Cb = -0.275(R' - 252) - 0.542(G' - 252) + 0.817(B' - 252) + 512 

Cr = 0.817(R' - 252) - 0.685(G' - 252) - 0.132(B' - 252) + 512 

R'G'B' has a nominal 10-bit range of 252- 
800 to match the active video levels used by 
the NTSC/PAL decoder in Chapter 9. Table 
5.2 and Figure 5.4 illustrate the 10-bit R'G'B' 
values for the white, black, blank, and 
(optional) sync levels. 



HDTV RGB Interface 

Some HDTV consumer video equipment 
supports an analog R'G'B' video interface. 
Three separate RCA phono connectors (con- 
sumer market) or BNC connectors (pro-video 
and PC market) are used. 

The horizontal and vertical video timing 
are dependent on the video standard, as dis- 
cussed in Chapter 4. For sources, the video 
signal at the connector should have a source 
impedance of 75 Ω ±5%. For receivers, video 
inputs should be AC-coupled and have a 75 Ω 
±5% input impedance. The three signals must 
be coincident with respect to each other within 
±5 ns. 

Sync information may be present on just 
the green channel, all three channels, as a sep- 
arate composite sync signal, or as separate hor- 
izontal and vertical sync signals. A gamma of 
1/0.45 is used. 



As shown in Figure 5.5, the nominal active 
video amplitude is 700 mV, and has no blanking 
pedestal. A ±300 ±6 mV tri-level composite 
sync signal may be present on just the green 
channel (consumer market), or all three channels 
(pro-video market). DC offsets up to ±1V 
may be present. 

Analog R'G'B' Generation 

Assuming 10-bit DACs with an output 
range of 0-1.305V (to match the video DACs 
used by the NTSC/PAL encoder in Chapter 9), 
the 10-bit YCbCr to R'G'B' equations are: 

R' = 0.625(Y - 64) + 0.963(Cr - 512) 

G' = 0.625(Y - 64) - 0.287(Cr - 512) - 0.114(Cb - 512) 

B' = 0.625(Y - 64) + 1.136(Cb - 512) 

R'G'B' has a nominal 10-bit range of 0-548 
to match the active video levels used by the 
NTSC/PAL encoder in Chapter 9. Note that 
negative values of R'G'B' should be supported 
at this point. 

The R'G'B' data is clamped by a blanking 
signal that has a raised cosine distribution to 
slow the slew rate of the start and end of the 
video signal. For 1080i and 720p systems, 
blank rise and fall times are 54 ±20 ns. For 
1080p systems, blank rise and fall times are 27 
±10 ns. 

Composite sync information may be added 
to the R'G'B' data after the blank processing 
has been performed. Values of 16 (sync low), 
488 (sync high), or 252 (no sync) are assigned. 
The sync rise and fall times should be processed 
to generate a raised cosine distribution 
to slow the slew rate of the sync signal. For 
1080i systems, sync rise and fall times are 54 
±20 ns, and the horizontal sync low and high 
widths at the 50% points are 593 ±40 ns. For 
720p systems, sync rise and fall times are 54 
±20 ns, and the horizontal sync low and high 
widths at the 50% points are 539 ±40 ns. For 
1080p systems, sync rise and fall times are 27 
±10 ns, and the horizontal sync low and high 
widths at the 50% points are 296 ±20 ns. 

[Figure 5.5. HDTV Analog RGB Levels. 0 IRE blanking level. 
Green, blue, or red channel: white level 1.020 V (800), 
black / blank level 0.321 V (252). When sync is present, sync 
high level 0.622 V (488) and sync low level 0.020 V (16).] 

At this point, we have digital R'G'B' with 
sync and blanking information, as shown in 
Figure 5.5 and Table 5.3. The numbers in 
parentheses in Figure 5.5 indicate the data 
value for a 10-bit DAC with a full-scale output 
value of 1.305V. The digital R'G'B' data drives 
three 10-bit DACs to generate the analog 
R'G'B' video signals. 



Video        0 IRE 
Level        Blanking Pedestal 

white        800 
sync high    488 
black        252 
blank        252 
sync low     16 

Table 5.3. HDTV 10-Bit R'G'B' Values. 



Analog R'G'B' Digitization 

Assuming 10-bit ADCs with an input range 
of 0-1.305V (to match the video ADCs used by 
the NTSC/PAL decoder in Chapter 9), the 10-bit 
R'G'B' to YCbCr equations are: 

Y = 0.341(R' - 252) + 1.143(G' - 252) + 0.115(B' - 252) + 64 

Cb = -0.188(R' - 252) - 0.629(G' - 252) + 0.817(B' - 252) + 512 

Cr = 0.817(R' - 252) - 0.743(G' - 252) - 0.074(B' - 252) + 512 

R'G'B' has a nominal 10-bit range of 252- 
800 to match the active video levels used by 
the NTSC/PAL decoder in Chapter 9. Table 
5.3 and Figure 5.5 illustrate the 10-bit R'G'B' 
values for the white, black, blank, and 
(optional) sync levels. 
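
The HDTV digitization coefficients appear to follow from the BT.709 luma weights (0.2126, 0.7152, 0.0722) scaled by the ratio of the YCbCr code range to the 548-code R'G'B' range. The sketch below works under that assumption, which is not stated in the text, and the names are illustrative. The derived values agree with the printed coefficients to within about 0.002, which is consistent with rounding.

```python
KR, KG, KB = 0.2126, 0.7152, 0.0722        # BT.709 luma weights (assumed basis)
Y_SCALE = (940 - 64) / (800 - 252)         # 876 luma codes over 548 R'G'B' codes
C_SCALE = (960 - 64) / (800 - 252)         # 896 chroma codes over 548 R'G'B' codes


def digitization_coefficients() -> tuple:
    """Return (Y, Cb, Cr) rows of R', G', B' multipliers."""
    y_row = tuple(k * Y_SCALE for k in (KR, KG, KB))
    cb_row = tuple(C_SCALE * c for c in (-KR / (2 * (1 - KB)),
                                         -KG / (2 * (1 - KB)),
                                         0.5))
    cr_row = tuple(C_SCALE * c for c in (0.5,
                                         -KG / (2 * (1 - KR)),
                                         -KB / (2 * (1 - KR))))
    return y_row, cb_row, cr_row


y_row, cb_row, cr_row = digitization_coefficients()
for name, row in (("Y", y_row), ("Cb", cb_row), ("Cr", cr_row)):
    print(name, [f"{c:.4f}" for c in row])
```
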

Constrained Image 

Due to the limited availability of copy pro- 
tection technology for high-definition analog 
interfaces, some standards and DRM imple- 
mentations only allow a constrained image to 
be output. A constrained image has an effec- 
tive maximum resolution of 960 x 540p, 
although the total number of video samples 
and the video timing remain unchanged (for 
example, 1280 x 720p or 1920 x 1080i). 

In these situations, the full resolution 
image is still available via an approved secure 
digital video output, such as HDMI. 



SDTV YPbPr Interface 

Some SDTV consumer video equipment 
supports an analog YPbPr video interface. 
NTSC and PAL VBI (vertical blanking inter- 
val) data, discussed in Chapter 8, may be 
present on 480i or 576i Y video signals. Three 
separate RCA phono connectors (consumer 
market) or BNC connectors (pro-video mar- 
ket) are used. 

The horizontal and vertical video timing 
are dependent on the video standard, as dis- 
cussed in Chapter 4. For sources, the video 
signal at the connector should have a source 
impedance of 75 Ω ±5%. For receivers, video 
inputs should be AC-coupled and have a 75 Ω 
±5% input impedance. The three signals must 
be coincident with respect to each other within 
±5 ns. 
[Figure 5.6. EIA-770.2 SDTV Analog YPbPr Levels. Sync on Y. 
Y channel, sync present: white level 1.020 V (800), 
black / blank level 0.321 V (252), sync level 0.020 V (16). 
Pb or Pr channel, no sync present: peak level 1.003 V (786), 
black / blank level 0.653 V (512), peak level 0.303 V (238).] 
[Figure 5.7. SDTV Analog YPbPr Levels. Sync on YPbPr. 
Y channel, sync present: white level 1.020 V (800), 
black / blank level 0.321 V (252), sync level 0.020 V (16). 
Pb or Pr channel, sync present: peak level 1.003 V (786), 
black / blank level 0.653 V (512), sync level 0.352 V (276), 
peak level 0.303 V (238).] 
                White  Yellow  Cyan   Green  Magenta  Red    Blue   Black 

Y   IRE         100    88.6    70.1   58.7   41.3     29.9   11.4   0 
    mV          700    620     491    411    289      209    80     0 

Pb  IRE         0      -50     16.8   -33.1  33.1     -16.8  50     0 
    mV          0      -350    118    -232   232      -118   350    0 

Pr  IRE         0      8.1     -50    -41.8  41.8     50     -8.1   0 
    mV          0      57      -350   -293   293      350    -57    0 

Y   64 to 940   940    840     678    578    426      326    164    64 
Cb  64 to 960   512    64      663    215    809      361    960    512 
Cr  64 to 960   512    585     64     137    887      960    439    512 

Table 5.4. EIA-770.2 SDTV YPbPr and YCbCr 100% Color Bars. YPbPr values 
relative to the blanking level. 





                White  Yellow  Cyan   Green  Magenta  Red    Blue   Black 

Y   IRE         75     66.5    52.6   44     31       22.4   8.6    0 
    mV          525    465     368    308    217      157    60     0 

Pb  IRE         0      -37.5   12.6   -24.9  24.9     -12.6  37.5   0 
    mV          0      -262    88     -174   174      -88    262    0 

Pr  IRE         0      6.1     -37.5  -31.4  31.4     37.5   -6.1   0 
    mV          0      43      -262   -220   220      262    -43    0 

Y   64 to 940   721    646     525    450    335      260    139    64 
Cb  64 to 960   512    176     625    289    735      399    848    512 
Cr  64 to 960   512    567     176    231    793      848    457    512 

Table 5.5. EIA-770.2 SDTV YPbPr and YCbCr 75% Color Bars. YPbPr values 
relative to the blanking level. 
For consumer products, composite sync is 
present on only the Y channel. For pro-video 
applications, composite sync is present on all 
three channels. A gamma of 1/0.45 is specified. 

As shown in Figures 5.6 and 5.7, the Y signal 
consists of 700 mV of active video (with no 
blanking pedestal). Pb and Pr have a peak-to-peak 
amplitude of 700 mV. A 300 ±6 mV composite 
sync signal is present on just the Y channel 
(consumer market), or all three channels 
(pro-video market). DC offsets up to ±1V may 
be present. The 100% and 75% YPbPr color bar 
values are shown in Tables 5.4 and 5.5. 
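
The color bar entries in Tables 5.4 and 5.5 can be reproduced from normalized R'G'B' values. The sketch below assumes the standard-definition luma weights 0.299, 0.587, and 0.114, which are not stated in this section, and uses a hypothetical function name. The `amplitude` parameter is 1.0 for 100% bars and 0.75 for 75% bars.

```python
KR, KB = 0.299, 0.114
KG = 1.0 - KR - KB


def ypbpr_mv(r: float, g: float, b: float, amplitude: float = 1.0) -> tuple:
    """Y, Pb, Pr in mV relative to blanking for normalized R'G'B' (0 to 1)."""
    y = KR * r + KG * g + KB * b
    pb = 0.5 * (b - y) / (1.0 - KB)
    pr = 0.5 * (r - y) / (1.0 - KR)
    full_scale_mv = 700.0 * amplitude
    return tuple(round(full_scale_mv * v) for v in (y, pb, pr))


print("100% yellow:", ypbpr_mv(1, 1, 0))
print("100% red:   ", ypbpr_mv(1, 0, 0))
print("75% yellow: ", ypbpr_mv(1, 1, 0, amplitude=0.75))
```
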

Analog YPbPr Generation 

Assuming 10-bit DACs with an output 
range of 0-1.305V (to match the video DACs 
used by the NTSC/PAL encoder in Chapter 9), 
the 10-bit YCbCr to YPbPr equations are: 

Y = ((800 - 252) / (940 - 64)) (Y - 64) 

Pb = ((800 - 252) / (960 - 64)) (Cb - 512) 

Pr = ((800 - 252) / (960 - 64)) (Cr - 512) 

Y has a nominal 10-bit range of 0-548 to 
match the active video levels used by the 
NTSC/PAL encoder in Chapter 9. Pb and Pr 
have a nominal 10-bit range of 0 to ±274. Note 
that negative values of Y should be supported 
at this point. 
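
The scaling above can be checked against the levels in Figures 5.6 and 5.7. In the sketch below, the function names are hypothetical, and adding the blank levels (252 for Y, 512 for Pb and Pr, per Table 5.6) to obtain the final DAC codes is an inference from the level table rather than a step spelled out in the text.

```python
SCALE_Y = (800 - 252) / (940 - 64)
SCALE_C = (800 - 252) / (960 - 64)


def ycbcr_to_ypbpr(y: int, cb: int, cr: int) -> tuple:
    """Scale 10-bit YCbCr to Y (0-548) and Pb, Pr (0 to +/-274)."""
    return SCALE_Y * (y - 64), SCALE_C * (cb - 512), SCALE_C * (cr - 512)


def ypbpr_dac_codes(y: int, cb: int, cr: int) -> tuple:
    """Offset the scaled values by the assumed blank levels to get DAC codes."""
    y_s, pb_s, pr_s = ycbcr_to_ypbpr(y, cb, cr)
    return round(y_s) + 252, round(pb_s) + 512, round(pr_s) + 512


print(ypbpr_dac_codes(940, 512, 512))  # white
print(ypbpr_dac_codes(64, 960, 64))    # black luma, Pb at positive peak, Pr at negative peak
```
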

The YPbPr data is clamped by a blanking 
signal that has a raised cosine distribution to 
slow the slew rate of the start and end of the 
video signal. For 480i and 576i systems, blank 
rise and fall times are 140 ±20 ns. For 480p and 
576p systems, blank rise and fall times are 70 
±10 ns. 



Composite sync information is added to 
the Y data after the blank processing has been 
performed. Values of 16 (sync present) or 252 
(no sync) are assigned. The sync rise and fall 
times should be processed to generate a raised 
cosine distribution (between 16 and 252) to 
slow the slew rate of the sync signal. 

Composite sync information may also be 
added to the PbPr data after the blank process- 
ing has been performed. Values of 276 (sync 
present) or 512 (no sync) are assigned. The 
sync rise and fall times should be processed to 
generate a raised cosine distribution (between 
276 and 512) to slow the slew rate of the sync 
signal. 

For 480i and 576i systems, sync rise and 
fall times are 140 ±20 ns, and horizontal sync 
width at the 50% point is 4.7 ±0.1 µs. For 480p 
and 576p systems, sync rise and fall times are 
70 ±10 ns, and horizontal sync width at the 50% 
point is 2.33 ±0.05 µs. 

At this point, we have digital YPbPr with 
sync and blanking information, as shown in 
Figures 5.6 and 5.7 and Table 5.6. The num- 
bers in parentheses in Figures 5.6 and 5.7 indi- 
cate the data value for a 10-bit DAC with a full- 
scale output value of 1.305V. The digital YPbPr 
data drive three 10-bit DACs to generate the 
analog YPbPr video signals. 



Video 
Level    Y      PbPr 

white    800    512 
black    252    512 
blank    252    512 
sync     16     276 

Table 5.6. SDTV 10-Bit YPbPr Values. 
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Analog YPbPr Digitization 

Assuming 10-bit ADCs with an input range 
of 0-1.305V (to match the video ADCs used by
the NTSC/PAL decoder in Chapter 9), the 10-bit
YPbPr to YCbCr equations are:

Y = 1.5985(Y - 252) + 64
Cb = 1.635(Pb - 512) + 512
Cr = 1.635(Pr - 512) + 512

Y has a nominal 10-bit range of 252-800 to 
match the active video levels used by the 
NTSC/PAL decoder in Chapter 9. Table 5.6 
and Figures 5.6 and 5.7 illustrate the 10-bit 
YPbPr values for the white, black, blank, and 
(optional) sync levels. 
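
The digitization equations above can be sketched in a few lines of Python. This is only an illustration: the function name, the rounding to the nearest integer, and the clamp to the 0-1023 code range are assumptions, not part of the equations themselves.

```python
def ypbpr_to_ycbcr(y, pb, pr):
    """Convert 10-bit ADC codes (Y 252-800, Pb/Pr centered on 512) to 10-bit YCbCr."""
    y_out = 1.5985 * (y - 252) + 64
    cb = 1.635 * (pb - 512) + 512
    cr = 1.635 * (pr - 512) + 512

    def to_code(value):
        # Assumption: round to nearest and clamp to the 10-bit range.
        return max(0, min(1023, int(round(value))))

    return to_code(y_out), to_code(cb), to_code(cr)


if __name__ == "__main__":
    print(ypbpr_to_ycbcr(800, 512, 512))  # white level
    print(ypbpr_to_ycbcr(252, 512, 512))  # black level
```

The white level (Y = 800) lands on YCbCr 940 and the black level (Y = 252) on 64, which matches the nominal YCbCr ranges used elsewhere in this chapter.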

VBI Data for 480p Systems 

CGMS Type A 

CEA-805, IEC 61880-2, and EIA-J CPR-1204-1
define the format of CGMS (Copy Generation
Management System) data on line 41 for 480p
systems. The waveform is illustrated in
Figure 5.8.

A sample clock rate of 27 MHz (59.94 Hz
frame rate) or 27.027 MHz (60 Hz frame rate)
is used. Each data bit is 26 clock cycles, or 963
±30 ns, wide with a maximum rise and fall time
of 50 ns. A logical “1” has an amplitude of 70
±10 IRE; a logical “0” has an amplitude of 0 ±5
IRE.

The 2-bit start symbol begins 156 clock
cycles, or about 5.778 µs, after 0H. It consists of
a “1” followed by a “0.”

The 6-bit header follows the start symbol, 
and defines the nature of the payload data as 
shown in Table 5.7. The End of Message 
immediately follows the last packet of any data 
service that uses more than one packet. It has 
an associated payload consisting of all zeros. 
ECCI is a data service that may use more than 
one packet, thus requiring the use of the End 
of Message. 

The 14-bit payload for CGMS data is 
shown in Table 5.8. The 14-bit payload for 
ECCI data is currently “reserved,” consisting 
of all ones. 





[Figure 5.8: line 41 waveform. A 2-bit start symbol, a 6-bit header (H0-H5), and 14 data bits (D0-D13) sit on the blank level with a 70 ±10 IRE amplitude; the start symbol begins 5.778 µs after 0H.]

Figure 5.8. CEA-805, IEC 61880-2, and EIA-J CPR-1204-1 Line 41 Timing.

H0   H1   Aspect Ratio   Picture Display Format
0    0    4:3            normal
0    1    4:3            letter box
1    0    16:9           normal
1    1    CEA-805 Type A packet

H2   H3   H4   H5   Service Name
0    0    0    0    CGMS (see Table 5.8)
0    0    0    1    Extended Copy Control Information (ECCI)
0    0    1    0    reserved
          :
1    1    1    0    reserved
1    1    1    1    End of Message (default if no copyright information)

Table 5.7. CEA-805, IEC 61880-2, and EIA-J CPR-1204-1 Line 41
Header Format. The H2-H5 bits must be “0000” if Type A packet is
indicated.

D0   D1   D2   D3   D4    D5   D6   D7   D8-D13
G0   G1   G2   G3   ASB   0    0    0    CRC = x^6 + x + 1

G0-G1: CGMS Definition
  00   copying permitted
  01   no more copies (one copy has already been made)
  10   one copy permitted
  11   no copying permitted

G2-G3: Analog Protection Service (valid only if both G0-G1 are “01” or “10”)
  00   no Analog Protection Service
  01   PSP on, color striping off
  10   PSP on, 2-line color striping on
  11   PSP on, 4-line color striping on

ASB: Analog Source Bit
  0    not analog pre-recorded medium
  1    analog pre-recorded medium

Table 5.8. CEA-805, IEC 61880-2, and EIA-J CPR-1204-1 Line 41 CGMS Service Format.
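
As an illustration of Table 5.8, the sketch below decodes a 14-bit CGMS payload given as a list of bits D0 through D13. The function and dictionary names are hypothetical. The two-bit codes are looked up as the strings printed in the table (G0 first, then G1), so no assumption is made about which bit is more significant, and the CRC bits are returned unchecked because the CRC procedure is not described here.

```python
CGMS = {
    "00": "copying permitted",
    "01": "no more copies",
    "10": "one copy permitted",
    "11": "no copying permitted",
}

APS = {
    "00": "no Analog Protection Service",
    "01": "PSP on, color striping off",
    "10": "PSP on, 2-line color striping on",
    "11": "PSP on, 4-line color striping on",
}


def decode_cgms_payload(bits):
    """Decode payload bits [D0, ..., D13] according to Table 5.8."""
    if len(bits) != 14 or any(b not in (0, 1) for b in bits):
        raise ValueError("expected 14 bits, each 0 or 1")
    g01 = f"{bits[0]}{bits[1]}"  # G0 then G1, as printed in the table
    g23 = f"{bits[2]}{bits[3]}"  # G2 then G3
    return {
        "cgms": CGMS[g01],
        "aps": APS[g23],
        "analog_source": bool(bits[4]),  # ASB
        "crc_bits": list(bits[8:14]),    # D8-D13, not verified here
    }


if __name__ == "__main__":
    print(decode_cgms_payload([1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]))
```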



CGMS Type B 

CEA-805 defines the format of CGMS 
(Copy Generation Management System) data 
on line 40 for 480p systems. The waveform is 
illustrated in Figure 5.9. 

A sample clock rate of 27 MHz (59.94 Hz
frame rate) or 27.027 MHz (60 Hz frame rate)
is used. Each data bit is four clock cycles, or
148 ±18.5 ns, wide with a maximum rise and
fall time of 37 ns. A logical “1” has an amplitude
of 70 ±10 IRE; a logical “0” has an amplitude of
0 ±5 IRE.

The 2-bit start symbol begins 156 clock
cycles, or about 5.778 µs, after 0H. It consists of
a “1” followed by a “0.”

The 6-bit header follows the start symbol, 
and defines the nature of the payload data as 
shown in Table 5.9. 

The 16-byte payload is shown in Table 5.10.

[Figure 5.9: line 40 waveform. A 2-bit start symbol, a 6-bit header (H0-H5), and 128 data bits (D0-D127) sit on the blank level; the start symbol begins 5.778 µs after 0H.]

Figure 5.9. CEA-805 Line 40 Timing.

H0   H1   H2   H3   H4   H5   Service Name
0    0    0    0    0    0    reserved for future use
               :
1    1    0    0    0    1
1    1    0    0    1    0    Type B packet
1    1    0    0    1    0    reserved for future use
               :
1    1    1    1    1    1

Table 5.9. CEA-805 Line 40 Header Format.

D7    D6    D5    D4    D3    D2    D1    D0
version number = 0000 0001
length of payload packet = 0001 0000
AR1   AR0   ASB   A0    1     B0    S1    S0
C3    C2    C1    C0    R3    R2    R1    R0
RCI   1     1     1     G3    G2    G1    G0
0     0     0     0     0     0     0     0
0     0     0     0     0     0     0     0
line number of end of top bar (lower 8 bits)
line number of end of top bar (upper 8 bits)
line number of start of bottom bar (lower 8 bits)
line number of start of bottom bar (upper 8 bits)
pixel number of end of left bar (lower 8 bits)
pixel number of end of left bar (upper 8 bits)
pixel number of start of right bar (lower 8 bits)
pixel number of start of right bar (upper 8 bits)
1     1     CRC = x^6 + x + 1

AR1-AR0: Intended display aspect ratio
  00   4:3 normal
  01   4:3 letterbox
  10   16:9 normal
  11   reserved

ASB: Analog Source Bit

A0: Active Format Description (AFD) data flag
  0    no AFD data
  1    AFD data (R0-R3) valid

B0: Bar data (for letterboxing)
  0    no bar data
  1    bar data present

S1-S0: Scan data (amount of overscan and underscan is not indicated)
  00   no data
  01   overscanned (television)
  10   underscanned (computer)
  11   reserved

C3-C0: Colorimetry
  0000   no data
  0001   BT.601
  0010   BT.709
  0011   reserved
    :
  1111   reserved

R0-R3: Active Format Description (AFD)
  (R0-R3) active_format value (refer to Table 13.56)

RCI: Redistribution Control Information (RCI) flag

G0-G1: CGMS Definition (refer to Table 5.8)

G2-G3: Analog Protection Services (refer to Table 5.8)

Table 5.10. CEA-805 Line 40 Payload Format.

VBI Data for 576p Systems

CGMS 

IEC 62375 defines the format of CGMS 
(Copy Generation Management System) and 
widescreen signaling (WSS) data on line 43 for 
576p systems. The waveform is illustrated in 
Figure 5.10. This standard allows a WSS- 
enhanced 16:9 TV to display programs in their 
correct aspect ratio. 

Data Timing 

CGMS and WSS data is normally on line 
43, as shown in Figure 5.10. However, due to 
video editing, the data may reside on any line 
between 43-47. 

The clock frequency is 10 MHz (±1 kHz).
The signal waveform should be a sine-squared
pulse, with a half-amplitude duration of 100 ±10
ns. The signal amplitude is 500 mV ±5%.

The NRZ data bits are processed by a bi-phase
code modulator, such that one data
period equals 6 elements at 10 MHz.

Data Content 

The WSS consists of a run-in code, a start 
code, and 14 bits of data, as shown in Table 
5.11. 

Run-In 

The run-in consists of 29 elements of a spe- 
cific sequence at 10 MHz, shown in Table 5.11. 

Start Code 

The start code consists of 24 elements of a 
specific sequence at 10 MHz, shown in Table 
5.11. 



[Figure 5.10: line 43 waveform. A 29-element run-in, a 24-element start code, and 84 elements of data (b0-b13), all at 10 MHz, sit on the blank level. Labels: 500 mV ±5%, 43 IRE, 90-110 ns rise/fall times (2T bar shaping), 5.5 ±0.125 µs, 13.7 µs.]

Figure 5.10. IEC 62375 Line 43 CGMS Timing.

Group 1 Data

The group 1 data consists of 4 data bits
that specify the aspect ratio. Each data bit
generates 6 elements at 10 MHz. b0 is the LSB.

Table 5.12 lists the data bit assignments
and usage. The number of active lines listed in
Table 5.12 are for the exact aspect ratio (a =
1.33, 1.56, or 1.78).

The aspect ratio label indicates a range of
possible aspect ratios (a) and number of active
lines:



Label    Aspect Ratio (a)     Active Lines
4:3      a ≤ 1.46             527-576
14:9     1.46 < a ≤ 1.66      463-526
16:9     1.66 < a ≤ 1.90      405-462
>16:9    a > 1.90             <405



To allow automatic selection of the display 
mode, a 16:9 receiver should support the fol- 
lowing minimum requirements: 

Case 1: The 4:3 aspect ratio picture should be cen- 
tered on the display, with black bars on the left 
and right sides. 

Case 2: The 14:9 aspect ratio picture should be 
centered on the display, with black bars on the left 
and right sides. Alternately, the picture may be 
displayed using the full display width by using a 
small (typically 8%) horizontal geometrical error. 

Case 3: The 16:9 aspect ratio picture should be dis- 
played using the full width of the display. 

Case 4: The >16:9 aspect ratio picture should be 
displayed as in Case 3 or use the full height of the 
display by zooming in. 
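
The aspect ratio ranges and the four minimum-requirement cases can be tied together with a short lookup. The sketch below is illustrative only: the function names are hypothetical, and treating each upper bound (1.46, 1.66, 1.90) as inclusive is an assumption about how the boundary values are assigned.

```python
def wss_aspect_label(a):
    """Return the aspect ratio label for a picture aspect ratio a."""
    if a <= 1.46:
        return "4:3"
    if a <= 1.66:
        return "14:9"
    if a <= 1.90:
        return "16:9"
    return ">16:9"


# Minimum-requirement case that a 16:9 receiver applies to each label.
DISPLAY_CASE = {"4:3": 1, "14:9": 2, "16:9": 3, ">16:9": 4}


def display_case(a):
    """Map an aspect ratio to the case number listed above."""
    return DISPLAY_CASE[wss_aspect_label(a)]


if __name__ == "__main__":
    for ratio in (1.33, 1.56, 1.78, 2.35):
        print(ratio, wss_aspect_label(ratio), display_case(ratio))
```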



Group 3 Data 

The group 3 data consists of three data bits
that specify subtitles. Each data bit generates
six elements at 10 MHz. Data bit b8 is the LSB.

b9, b10: open subtitles
  00   no
  01   outside active picture
  10   inside active picture
  11   reserved

Group 4 Data

The group 4 data consists of three data bits
that specify surround sound and copy protection.
Each data bit generates six elements at 10
MHz. Data bit b11 is the LSB.

b11: surround sound
  0    no
  1    yes

b12: copyright
  0    no copyright asserted or unknown
  1    copyright asserted

b13: copy protection
  0    copying not restricted
  1    copying restricted

run-in                 29 elements at 10 MHz    1 1111 0001 1100 0111 0001 1100 0111
                                                (0x1F1C71C7)

start code             24 elements at 10 MHz    0001 1110 0011 1100 0001 1111
                                                (0x1E3C1F)

group 1                24 elements at 10 MHz    b0, b1, b2, b3
(aspect ratio)         “0” = 000 111
                       “1” = 111 000

group 2                24 elements at 10 MHz    b4, b5, b6, b7
(enhanced services)    “0” = 000 111            (b4, b5, b6, and b7 = “0” since reserved)
                       “1” = 111 000

group 3                18 elements at 10 MHz    b8, b9, b10
(subtitles)            “0” = 000 111            (b8 = “0” since reserved)
                       “1” = 111 000

group 4                18 elements at 10 MHz    b11, b12, b13
(reserved)             “0” = 000 111
                       “1” = 111 000

Table 5.11. IEC 62375 Line 43 WSS Information.
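
Table 5.11 can be turned into the full 10 MHz element sequence with a few lines of Python. This is a minimal sketch: the function name is hypothetical, and sending b0 first, in the order the bits are listed in the table, is an assumption.

```python
# Run-in and start code from Table 5.11, expanded to element strings.
RUN_IN = format(0x1F1C71C7, "029b")     # 29 elements
START_CODE = format(0x1E3C1F, "024b")   # 24 elements

# Bi-phase coding: each data bit becomes six elements.
BIPHASE = {0: "000111", 1: "111000"}


def wss_elements(data_bits):
    """Build the element string for data bits [b0, ..., b13]."""
    if len(data_bits) != 14:
        raise ValueError("expected 14 data bits (b0-b13)")
    return RUN_IN + START_CODE + "".join(BIPHASE[b] for b in data_bits)


if __name__ == "__main__":
    # Group 1 = 0001 (4:3 full format in Table 5.12), all other bits 0.
    elements = wss_elements([0, 0, 0, 1] + [0] * 10)
    print(len(elements), "elements,", len(elements) / 10, "microseconds")
```

The result is 29 + 24 + 84 = 137 elements, which at 10 MHz lasts 13.7 µs.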



                 Aspect Ratio                  Position On    Active    Minimum
b0, b1, b2, b3   Label          Format         4:3 Display    Lines     Requirements
0001             4:3            full format    -              576       case 1
1000             14:9           letterbox      center         504       case 2
0100             14:9           letterbox      top            504       case 2
1101             16:9           letterbox      center         430       case 3
0010             16:9           letterbox      top            430       case 3
1011             >16:9          letterbox      center         -         case 4
0111             14:9           full format    center         576       -
1110             16:9           full format    -              576       -
                                (anamorphic)

Table 5.12. IEC 62375 Group 1 (Aspect Ratio) Data Bit Assignments and Usage.

HDTV YPbPr Interface

Most HDTV consumer video equipment 
supports an analog YPbPr video interface. 
Three separate RCA phono connectors (con- 
sumer market) or BNC connectors (pro-video 
market) are used. 

The horizontal and vertical video timing is 
dependent on the video standard, as discussed 
in Chapter 4. For sources, the video signal at
the connector should have a source impedance
of 75 Ω ±5%. For receivers, video inputs should
be AC-coupled and have a 75-Ω ±5% input
impedance. The three signals must be coincident
with respect to each other within ±5 ns.

For consumer products, composite sync is 
present on only the Y channel. For pro-video 
applications, composite sync is present on all 
three channels. A gamma of 1/0.45 is speci- 
fied. 

As shown in Figures 5.11 and 5.12, the Y
signal consists of 700 mV of active video (with
no blanking pedestal). Pb and Pr have a
peak-to-peak amplitude of 700 mV. A ±300 ±6 mV
composite sync signal is present on just the Y
channel (consumer market), or all three
channels (pro-video market). DC offsets up to ±1V
may be present. The 100% and 75% YPbPr color
bar values are shown in Tables 5.13 and 5.14.

Analog YPbPr Generation 

Assuming 10-bit DACs with an output 
range of 0-1.305V (to match the video DACs 
used by the NTSC/PAL encoder in Chapter 9) , 
the 10-bit YCbCr to YPbPr equations are: 

Y = ((800 - 252) / (940 - 64)) (Y - 64) 

Pb = ((800 - 252)/ (960 - 64)) (Cb - 512) 

Pr = ((800 - 252)/ (960 - 64)) (Cr - 512) 



Y has a nominal 10-bit range of 0-548 to 
match the active video levels used by the 
NTSC/PAL encoder in Chapter 9. Pb and Pr 
have a nominal 10-bit range of 0 to ±274. Note
that negative values of Y should be supported 
at this point. 
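
The generation equations can be checked numerically with a short sketch. The function names are hypothetical, and the second function's offsets of 252 for Y and 512 for Pb and Pr are inferred from the black and blank levels in Table 5.15 rather than stated in the equations, so treat them as an assumption.

```python
SCALE_Y = (800 - 252) / (940 - 64)
SCALE_C = (800 - 252) / (960 - 64)


def ycbcr_to_ypbpr(y, cb, cr):
    """Apply the 10-bit YCbCr to YPbPr equations (Y 0-548, Pb/Pr 0 to +/-274)."""
    return SCALE_Y * (y - 64), SCALE_C * (cb - 512), SCALE_C * (cr - 512)


def to_dac_codes(y, cb, cr):
    """Assumed final step: add the blank-level offsets and round to DAC codes."""
    y_out, pb, pr = ycbcr_to_ypbpr(y, cb, cr)
    return round(y_out + 252), round(pb + 512), round(pr + 512)


if __name__ == "__main__":
    print(ycbcr_to_ypbpr(940, 960, 64))
    print(to_dac_codes(940, 512, 512))
```

With these offsets, YCbCr white (940, 512, 512) maps to DAC codes (800, 512, 512), matching the white level shown for the Y and PbPr channels.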

The YPbPr data is clamped by a blanking 
signal that has a raised cosine distribution to 
slow the slew rate of the start and end of the 
video signal. For 1080i and 720p systems, 
blank rise and fall times are 54 ±20 ns. For
1080p systems, blank rise and fall times are 27
±10 ns.

Composite sync information is added to 
the Y data after the blank processing has been 
performed. Values of 16 (sync low), 488 (high 
sync) , or 252 (no sync) are assigned. The sync 
rise and fall times should be processed to gen- 
erate a raised cosine distribution to slow the 
slew rate of the sync signal. 

Composite sync information may be added 
to the PbPr data after the blank processing has 
been performed. Values of 276 (sync low), 748 
(high sync), or 512 (no sync) are assigned. 
The sync rise and fall times should be pro- 
cessed to generate a raised cosine distribution 
to slow the slew rate of the sync signal. 

For 1080i systems, sync rise and fall times
are 54 ±20 ns, and the horizontal sync low and
high widths at the 50% points are 593 ±40 ns.
For 720p systems, sync rise and fall times are
54 ±20 ns, and the horizontal sync low and
high widths at the 50% points are 539 ±40 ns.
For 1080p systems, sync rise and fall times are
27 ±10 ns, and the horizontal sync low and
high widths at the 50% points are 296 ±20 ns.

At this point, we have digital YPbPr with
sync and blanking information, as shown in
Figures 5.11 and 5.12 and Table 5.15. The
numbers in parentheses in Figures 5.11 and 5.12
indicate the data value for a 10-bit DAC with a
full-scale output value of 1.305V. The digital
YPbPr data drives three 10-bit DACs to generate
the analog YPbPr video signals.

[Figure 5.11 levels]
Y channel, sync present:
  white level (800)           1.020 V
  sync level (488)            0.622 V
  black / blank level (252)   0.321 V
  sync level (16)             0.020 V
Pb or Pr channel, no sync present:
  peak level (786)            1.003 V
  black / blank level (512)   0.653 V
  peak level (238)            0.303 V

Figure 5.11. EIA-770.3 HDTV Analog YPbPr Levels. Sync on Y.

[Figure 5.12 levels]
Y channel, sync present:
  white level (800)           1.020 V
  sync level (488)            0.622 V
  black / blank level (252)   0.321 V
  sync level (16)             0.020 V
Pb or Pr channel, sync present:
  peak level (786)            1.003 V
  sync level (748)            0.954 V
  black / blank level (512)   0.653 V
  sync level (276)            0.352 V
  peak level (238)            0.303 V

Figure 5.12. SMPTE 274M and 296M HDTV Analog YPbPr Levels. Sync on YPbPr.

                   White  Yellow  Cyan   Green  Magenta  Red    Blue   Black
Y    IRE           100    92.8    78.7   71.5   28.5     21.3   7.2    0
     mV            700    650     551    501    200      149    50     0
Pb   IRE           0      -50     11.4   -38.5  38.5     -11.4  50     0
     mV            0      -350    80     -270   270      -80    350    0
Pr   IRE           0      4.6     -50    -45.4  45.4     50     -4.6   0
     mV            0      32      -350   -318   318      350    -32    0
Y    64 to 940     940    877     753    690    314      251    127    64
Cb   64 to 960     512    64      614    167    857      410    960    512
Cr   64 to 960     512    553     64     106    918      960    471    512

Table 5.13. EIA-770.3 HDTV YPbPr and YCbCr 100% Color Bars. YPbPr values
relative to the blanking level.

                   White  Yellow  Cyan   Green  Magenta  Red    Blue   Black
Y    IRE           75     69.6    59     53.7   21.3     16     5.4    0
     mV            525    487     413    376    149      112    38     0
Pb   IRE           0      -37.5   8.6    -28.9  28.9     -8.6   37.5   0
     mV            0      -263    60     -202   202      -60    263    0
Pr   IRE           0      3.5     -37.5  -34    34       37.5   -3.5   0
     mV            0      24      -263   -238   238      263    -24    0
Y    64 to 940     721    674     581    534    251      204    111    64
Cb   64 to 960     512    176     589    253    771      435    848    512
Cr   64 to 960     512    543     176    207    817      848    481    512

Table 5.14. EIA-770.3 HDTV YPbPr and YCbCr 75% Color Bars. YPbPr values
relative to the blanking level.



Video Level    Y      PbPr
white          800    512
sync high      488    748
black          252    512
blank          252    512
sync low       16     276

Table 5.15. HDTV 10-Bit YPbPr Values.



Analog YPbPr Digitization 

Assuming 10-bit ADCs with an input range 
of 0-1.305V (to match the video ADCs used by
the NTSC/PAL decoder in Chapter 9), the 10-bit
YPbPr to YCbCr equations are:

Y = 1.5985(Y - 252) + 64
Cb = 1.635(Pb - 512) + 512
Cr = 1.635(Pr - 512) + 512



Y has a nominal 10-bit range of 252-800 to 
match the active video levels used by the 
NTSC/PAL decoder in Chapter 9. Table 5.15 
and Figures 5.11 and 5.12 illustrate the 10-bit 
YPbPr values for the white, black, blank, and 
(optional) sync levels. 



VBI Data for 720p Systems 

CGMS Type A 

CEA-805 and EIA-J CPR-1204-2 define the 
format of CGMS (Copy Generation Manage- 
ment System) data on line 24 for 720p systems. 
The waveform is illustrated in Figure 5.13. 

A sample clock rate of 74.176 MHz (59.94
Hz frame rate) or 74.25 MHz (60 Hz frame
rate) is used. Each data bit is 58 clock cycles,
or 782 ±30 ns, wide with a maximum rise and
fall time of 50 ns. A logical “1” has an amplitude
of 70 ±10 IRE; a logical “0” has an amplitude of
0 ±5 IRE.

The 2-bit start symbol begins 232 clock
cycles, or about 3.128 µs, after 0H. It consists of
a “1” followed by a “0.”

The 6-bit header and 14-bit CGMS payload 
data format is the same as for 480p systems 
discussed earlier in this chapter. 

CGMS Type B 

CEA-805 defines the format of CGMS 
(Copy Generation Management System) data 
on line 23 for 720p systems. The waveform is 
illustrated in Figure 5.14. 

A sample clock rate of 74.176 MHz (59.94
Hz frame rate) or 74.25 MHz (60 Hz frame
rate) is used. Each data bit is eight clock
cycles, or 107.7 ±18.5 ns, wide with a maximum
rise and fall time of 37 ns. A logical “1” has an
amplitude of 70 ±10 IRE; a logical “0” has an
amplitude of 0 ±5 IRE.

The 2-bit start symbol begins 232 clock
cycles, or about 3.128 µs, after 0H. It consists of
a “1” followed by a “0.”

The 6-bit header and 16-byte payload data
format is the same as for 480p systems
discussed earlier in this chapter.

[Figure 5.13: line 24 waveform. A 2-bit start symbol, a 6-bit header (H0-H5), and 14 data bits (D0-D13) sit on the blank level; the start symbol begins 3.128 µs after 0H.]

Figure 5.13. CEA-805 and EIA-J CPR-1204-2 Line 24 Timing.

[Figure 5.14: line 23 waveform. A 2-bit start symbol, a 6-bit header (H0-H5), and 128 data bits (D0-D127) sit on the blank level; the start symbol begins 3.128 µs after 0H.]

Figure 5.14. CEA-805 Line 23 Timing.

VBI Data for 1080i Systems 

CGMS Type A 

CEA-805 and EIA-J CPR-1204-2 define the 
format of CGMS (Copy Generation Manage- 
ment System) data on lines 19 and 582 for 
1080i systems. The waveform is illustrated in 
Figure 5.15. 

A sample clock rate of 74.176 MHz (59.94
Hz field rate) or 74.25 MHz (60 Hz field rate) is
used. Each data bit is 77 clock cycles, or 1038
±30 ns, wide with a maximum rise and fall time
of 50 ns. A logical “1” has an amplitude of 70
±10 IRE; a logical “0” has an amplitude of 0 ±5
IRE.

The 2-bit start symbol begins 308 clock
cycles, or about 4.152 µs, after 0H. It consists of
a “1” followed by a “0.”



The 6-bit header and 14-bit CGMS payload 
data format is the same as for 480p systems 
discussed earlier in this chapter. 

CGMS Type B 

CEA-805 defines the format of CGMS 
(Copy Generation Management System) data 
on lines 18 and 581 for 1080i systems. The 
waveform is illustrated in Figure 5.16. 

A sample clock rate of 74.176 MHz (59.94
Hz frame rate) or 74.25 MHz (60 Hz frame
rate) is used. Each data bit is ten clock cycles,
or 134.6 ±18.5 ns, wide with a maximum rise
and fall time of 37 ns. A logical “1” has an
amplitude of 70 ±10 IRE; a logical “0” has an
amplitude of 0 ±5 IRE.

The 2-bit start symbol begins 308 clock
cycles, or about 4.152 µs, after 0H. It consists of
a “1” followed by a “0.”

The 6-bit header and 16-byte payload data 
format is the same as for 480p systems dis- 
cussed earlier in this chapter. 
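
The bit widths and start delays quoted for the 480p, 720p, and 1080i CGMS waveforms all follow from dividing a clock-cycle count by the sample clock rate. The sketch below reproduces that arithmetic. The function name and the table layout are illustrative only, and the choice of which of the two clock rates to use for each entry is an assumption made to match the printed figures.

```python
def cycles_to_ns(cycles, clock_mhz):
    """Duration in nanoseconds of a number of sample clock cycles."""
    return cycles * 1000.0 / clock_mhz


# (description, clock cycles, sample clock in MHz), values taken from the text
TIMINGS = [
    ("480p Type A bit width", 26, 27.0),
    ("480p Type B bit width", 4, 27.0),
    ("480p start delay", 156, 27.0),
    ("720p Type A bit width", 58, 74.176),
    ("720p Type B bit width", 8, 74.25),
    ("720p start delay", 232, 74.176),
    ("1080i Type A bit width", 77, 74.176),
    ("1080i Type B bit width", 10, 74.25),
    ("1080i start delay", 308, 74.176),
]

if __name__ == "__main__":
    for name, cycles, clock in TIMINGS:
        print(f"{name}: {cycles_to_ns(cycles, clock):.1f} ns")
```

For example, 26 cycles at 27 MHz is about 963 ns and 156 cycles is about 5.778 µs. The 1080i Type B width computes to about 134.7 ns, within the stated tolerance of the 134.6 ns quoted in the text.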



[Figure 5.15: lines 19 and 582 waveform. A 2-bit start symbol, a 6-bit header (H0-H5), and 14 data bits (D0-D13) sit on the blank level with a 70 ±10 IRE amplitude; the start symbol begins 4.152 µs after 0H.]

Figure 5.15. CEA-805 and EIA-J CPR-1204-2 Lines 19 and 582 Timing.

[Figure 5.16: lines 18 and 581 waveform. A 2-bit start symbol, a 6-bit header (H0-H5), and 128 data bits (D0-D127) sit on the blank level; the start symbol begins 4.152 µs after 0H.]

Figure 5.16. CEA-805 Lines 18 and 581 Timing.



Constrained Image 

Due to the limited availability of copy
protection technology for high-definition analog
interfaces, some standards and DRM
implementations only allow a constrained image to
be output. A constrained image has an effective
maximum resolution of 960 x 540p,
although the total number of video samples
and the video timing remain unchanged (for
example, 1280 x 720p60 or 1920 x 1080i30).

In these situations, the full resolution
image is still available via an approved secure
digital video output, such as HDMI.



D-Connector Interface 

A 14-pin female D-connector (EIA-J CP-4120
standard, EIA-J RC-5237 connector) is
optionally used on some high-end consumer
equipment in Japan, Hong Kong, and Singapore.
It is used to transfer EIA 770.2 or EIA
770.3 interlaced or progressive analog YPbPr
video.

There are five flavors of the D-connector,
referred to as D1, D2, D3, D4, and D5, each
used to indicate supported video formats, as
shown in Table 5.16. Figure 5.17 illustrates the
connector and Table 5.17 lists the pin names.

Three line signals (Line 1, Line 2, and Line
3) indicate the resolution and frame rate of the
YPbPr source video, as indicated in Table 5.18.

      480i   480p   720p   1080i   1080p
D1    X
D2    X      X
D3    X      X             X
D4    X      X      X      X
D5    X      X      X      X       X

Table 5.16. D-Connector Supported Video Formats.




Figure 5.17. D-Connector. 



Pin   Function                   Signal Level            Impedance
1     Y                          0.700V + sync           75 ohms
2     ground - Y
3     Pb                         ±0.350V                 75 ohms
4     ground - Pb
5     Pr                         ±0.350V                 75 ohms
6     ground - Pr
7     reserved 1
8     line 1                     0V, 2.2V, or 5V (1)     10K ±3K ohm
9     line 2                     0V, 2.2V, or 5V (1)     10K ±3K ohm
10    reserved 2
11    line 3                     0V, 2.2V, or 5V (1)     10K ±3K ohm
12    ground - detect plugged
13    reserved 3
14    detect plugged             0V = plugged in (2)     > 100K ohm

Notes:

1. 2.2V has range of 2.2V ±0.8V. 5V has a range of 5V ±1.5V.

2. Inside equipment, pin 12 is connected to ground and pin 14 is pulled to 5V through a
resistor. Inside each D-Connector plug, pins 12 and 14 are shorted together.

Table 5.17. D-Connector Pin Descriptions.

                                                                Chromaticity and
                                                                Reference White,
                                     Line 1   Line 2   Line 3   Color Space        Sync
                           Frame     Scan     Frame    Aspect   Equations, and     Amplitude
Resolution                 Rate      Lines    Rate     Ratio    Gamma Correction   on Y
1920x1080                  30i       5V       0V       5V       EIA-770.3          ±0.300V (3)
1920x1080                  25i (2)   5V       2.2V     5V       EIA-770.3          ±0.300V (3)
1920x1080                  30p       5V       2.2V     5V       EIA-770.3          ±0.300V (3)
1920x1080                  25p (2)   5V       2.2V     5V       EIA-770.3          ±0.300V (3)
1920x1080                  24p (2)   5V       2.2V     5V       EIA-770.3          ±0.300V (3)
1920x1080                  24sF (2)  5V       2.2V     5V       EIA-770.3          ±0.300V (3)
1280x720                   60p       2.2V     5V       5V       EIA-770.3          ±0.300V (3)
1280x720                   50p (2)   2.2V     2.2V     5V       EIA-770.3          ±0.300V (3)
1280x720                   30p       2.2V     2.2V     5V       EIA-770.3          ±0.300V (3)
1280x720                   25p (2)   2.2V     2.2V     5V       EIA-770.3          ±0.300V (3)
1280x720                   24p (2)   2.2V     2.2V     5V       EIA-770.3          ±0.300V (3)
640x480                    60p (2)   0V       5V       0V       EIA-770.2          -0.300V (3)
720x480, 16:9 Squeeze      60p       0V       5V       5V       EIA-770.2          -0.300V (3)
720x480, 16:9 Squeeze      30i       0V       0V       5V       EIA-770.2          -0.300V (3)
720x480, 16:9 Letterbox    30i       0V       0V       2.2V     EIA-770.2          -0.300V (3)
720x480, 4:3               30i       0V       0V       0V       EIA-770.2          -0.300V (3)

Notes:

1. 60p, 30i, 30p, and 24p frame rates also include the 59.94p, 29.97i, 29.97p, and 23.976p frame rates.

2. Not part of EIAJ CP-4120 specification, but commonly supported by equipment.

3. Relative to the blanking level.

Table 5.18. Voltage Levels of Line Signals for Various Video Formats for D-Connector.
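
A receiver could interpret the three line signals along the lines of the sketch below. It is only an illustration built from Tables 5.17 and 5.18: the function names are hypothetical, the 2.2V and 5V windows use the tolerances in the notes to Table 5.17, and treating anything below 1.4V as the 0V level is an assumption, since no tolerance is given for 0V.

```python
def classify_level(volts):
    """Map a measured line voltage to its nominal level: '0V', '2.2V', or '5V'."""
    if 1.4 <= volts <= 3.0:      # 2.2V +/-0.8V
        return "2.2V"
    if 3.5 <= volts <= 6.5:      # 5V +/-1.5V
        return "5V"
    if volts < 1.4:              # assumption: no 0V tolerance is specified
        return "0V"
    raise ValueError(f"{volts} V is outside the defined level ranges")


SCAN_LINES = {"0V": 480, "2.2V": 720, "5V": 1080}                        # Line 1
FRAME_RATE = {"0V": "30i", "2.2V": "other", "5V": "60p"}                 # Line 2
ASPECT_RATIO = {"0V": "4:3", "2.2V": "16:9 letterbox", "5V": "16:9"}     # Line 3


def decode_d_connector(line1, line2, line3):
    """Return (scan lines, frame rate, aspect ratio) for three line voltages."""
    return (
        SCAN_LINES[classify_level(line1)],
        FRAME_RATE[classify_level(line2)],
        ASPECT_RATIO[classify_level(line3)],
    )


if __name__ == "__main__":
    print(decode_d_connector(5.0, 0.0, 5.0))   # 1080-line, 30i, 16:9
    print(decode_d_connector(2.0, 4.8, 5.2))   # 720-line, 60p, 16:9
```

The "other" frame rate covers the 2.2V entries on Line 2 (for example 25i, 30p, 25p, 24p, and 50p), most of which fall outside the EIAJ CP-4120 specification.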
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Other Pro-Video Analog 
Interfaces 

Tables 5.19 and 5.20 list some other common
component analog video formats. The
horizontal and vertical timing is the same as
for 525-line (M) NTSC and 625-line (B, D, G,
H, I) PAL. The 100% and 75% color bar values
are shown in Tables 5.21 through 5.24. The
SMPTE, EBU N10, 625-line Betacam, and
625-line MII values are the same as for SDTV
YPbPr.



VGA Interface 

Table 5.25 and Figure 5.18 illustrate the
15-pin VGA connector used by computer
equipment, and some consumer equipment, to
transfer analog RGB signals. The analog RGB
signals do not contain sync information and
have no blanking pedestal, as shown in Figure
5.4.

                       Output        Signal Amplitudes
Format                 Signal        (volts)             Notes
SMPTE, EBU N10         Y             +0.700              0% setup on Y
                       sync          -0.300              100% saturation
                       R'-Y, B'-Y    ±0.350              three wire = (Y + sync), (R'-Y), (B'-Y)
525-line Betacam (1)   Y             +0.714              7.5% setup on Y only
                       sync          -0.286              100% saturation
                       R'-Y, B'-Y    ±0.467              three wire = (Y + sync), (R'-Y), (B'-Y)
625-line Betacam (1)   Y             +0.700              0% setup on Y
                       sync          -0.300              100% saturation
                       R'-Y, B'-Y    ±0.350              three wire = (Y + sync), (R'-Y), (B'-Y)
525-line MII (2)       Y             +0.700              7.5% setup on Y only
                       sync          -0.300              100% saturation
                       R'-Y, B'-Y    ±0.324              three wire = (Y + sync), (R'-Y), (B'-Y)
625-line MII (2)       Y             +0.700              0% setup on Y
                       sync          -0.300              100% saturation
                       R'-Y, B'-Y    ±0.350              three wire = (Y + sync), (R'-Y), (B'-Y)

Notes:

1. Trademark of Sony Corporation.

2. Trademark of Matsushita Corporation.

Table 5.19. Common Pro-Video Component Analog Video Formats.

                       Output        Signal Amplitudes
Format                 Signal        (volts)             Notes
SMPTE, EBU N10         G', B', R'    +0.700              0% setup on G', B', and R'
                       sync          -0.300              100% saturation
                                                         three wire = (G' + sync), B', R'
NTSC (setup)           G', B', R'    +0.714              7.5% setup on G', B', and R'
                       sync          -0.286              100% saturation
                                                         three wire = (G' + sync), B', R'
NTSC (no setup)        G', B', R'    +0.714              0% setup on G', B', and R'
                       sync          -0.286              100% saturation
                                                         three wire = (G' + sync), B', R'
MII (1)                G', B', R'    +0.700              7.5% setup on G', B', and R'
                       sync          -0.300              100% saturation
                                                         three wire = (G' + sync), B', R'

Notes:

1. Trademark of Matsushita Corporation.

Table 5.20. Common Pro-Video RGB Analog Video Formats.

              White  Yellow  Cyan   Green  Magenta  Red    Blue   Black
Y      IRE    100    89.5    72.3   61.8   45.7     35.2   18.0   7.5
       mV     700    626     506    433    320      246    126    53
B'-Y   IRE    0      -46.3   15.6   -30.6  30.6     -15.6  46.3   0
       mV     0      -324    109    -214   214      -109   324    0
R'-Y   IRE    0      7.5     -46.3  -38.7  38.7     46.3   -7.5   0
       mV     0      53      -324   -271   271      324    -53    0

Table 5.23. 525-Line MII 100% Color Bars. Values are relative to the blanking level.

              White  Yellow  Cyan   Green  Magenta  Red    Blue   Black
Y      IRE    76.9   69.0    56.1   48.2   36.2     28.2   15.4   7.5
       mV     538    483     393    338    253      198    108    53
B'-Y   IRE    0      -34.7   11.7   -23.0  23.0     -11.7  34.7   0
       mV     0      -243    82     -161   161      -82    243    0
R'-Y   IRE    0      5.6     -34.7  -29.0  29.0     34.7   -5.6   0
       mV     0      39      -243   -203   203      243    -39    0

Table 5.24. 525-Line MII 75% Color Bars. Values are relative to the blanking level.

Figure 5.18. VGA 15-Pin D-SUB Female Connector.

Pin   Function                   Signal Level   Impedance
1     red                        0.7v           75 ohms
2     green                      0.7v           75 ohms
3     blue                       0.7v           75 ohms
4     ground
5     ground
6     ground - red
7     ground - green
8     ground - blue
9     +5VDC
10    ground - HSYNC
11    ground - VSYNC
12    DDC SDA (data)             > 2.4v
13    HSYNC (horizontal sync)    > 2.4v
14    VSYNC (vertical sync)      > 2.4v
15    DDC SCL (clock)            > 2.4v

Notes:

1. DDC = Display Data Channel.

Table 5.25. VGA Connector Signals.





Chapter 6 



Digital Video 
Interfaces 



Pro-Video Component 
Interfaces 

Pro-video equipment, such as that used 
within studios, has unique requirements and 
therefore its own set of digital video intercon- 
nect standards. Table 6.1 lists the various pro- 
video parallel and serial digital interface stan- 
dards. 

Video Timing 

Rather than digitize and transmit the 
blanking intervals, special sequences are 
inserted into the digital video stream to indi- 
cate the start of active video (SAV) and end of 
active video (EAV). These EAV and SAV 
sequences indicate when horizontal and verti- 
cal blanking is present and which field is being 
transmitted. They also enable the transmission 
of ancillary data such as digital audio, teletext, 
captioning, etc. during the blanking intervals. 



The EAV and SAV sequences must have 
priority over active video data or ancillary data 
to ensure that correct video timing is always 
maintained at the receiver. The receiver 
decodes the EAV and SAV sequences to 
recover the video timing. 

The video timing sequence of the encoder 
is controlled by three timing signals discussed 
in Chapter 4: H (horizontal blanking) , V (verti- 
cal blanking), and F (Field 1 or Field 2). A 
zero-to-one transition of H triggers an EAV 
sequence while a one-to-zero transition trig- 
gers an SAV sequence. F and V are allowed to 
change only at EAV sequences. 
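
The relationship between the H timing signal and the EAV and SAV sequences can be sketched as a simple edge detector. The function name and the list-of-samples representation of H are hypothetical; the sketch only marks where each sequence would be triggered and does not build the sequences themselves.

```python
def timing_events(h_samples):
    """Return (index, event) pairs for each H transition.

    A 0-to-1 transition of H triggers an EAV sequence and a
    1-to-0 transition triggers an SAV sequence.
    """
    events = []
    for index in range(1, len(h_samples)):
        previous, current = h_samples[index - 1], h_samples[index]
        if previous == 0 and current == 1:
            events.append((index, "EAV"))
        elif previous == 1 and current == 0:
            events.append((index, "SAV"))
    return events


if __name__ == "__main__":
    # Active video (H = 0), horizontal blanking (H = 1), active video again.
    print(timing_events([0, 0, 1, 1, 0, 0, 1]))
```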

Usually, both 8-bit and 10-bit interfaces are 
supported, with the 10-bit interface used to 
transmit 2 bits of fractional video data to mini- 
mize cumulative processing errors and to sup- 
port 10-bit ancillary data. 

YCbCr or R'G'B' data may not use the 10-bit values of
0x000-0x003 and 0x3FC-0x3FF, or the 8-bit values of 0x00 and
0xFF, since they are used for timing information.
Active        Total           Display  Frame   1x Y Sample  SDTV    Digital                        Digital
Resolution    Resolution (1)  Aspect   Rate    Rate         or      Parallel                       Serial
(H x V)       (H x V)         Ratio    (Hz)    (MHz)        HDTV    Standard                       Standard
------------  --------------  -------  ------  -----------  ------  -----------------------------  --------------------
720 x 480i    858 x 525i      4:3      29.97   13.5         SDTV    BT.656, BT.799, SMPTE 125M     BT.656, BT.799
720 x 480p    858 x 525p      4:3      59.94   27           SDTV    -                              BT.1362, SMPTE 294M
720 x 576i    864 x 625i      4:3      25      13.5         SDTV    BT.656, BT.799                 BT.656, BT.799
720 x 576p    864 x 625p      4:3      50      27           SDTV    -                              BT.1362
960 x 480i    1144 x 525i     16:9     29.97   18           SDTV    BT.1302, BT.1303, SMPTE 267M   BT.1302, BT.1303
960 x 576i    1152 x 625i     16:9     25      18           SDTV    BT.1302, BT.1303               BT.1302, BT.1303
1280 x 720p   1650 x 750p     16:9     59.94   74.176       HDTV    SMPTE 274M                     -
1280 x 720p   1650 x 750p     16:9     60      74.25        HDTV    SMPTE 274M                     -
1920 x 1080i  2200 x 1125i    16:9     29.97   74.176       HDTV    BT.1120, SMPTE 274M            BT.1120, SMPTE 292M
1920 x 1080i  2200 x 1125i    16:9     30      74.25        HDTV    BT.1120, SMPTE 274M            BT.1120, SMPTE 292M
1920 x 1080p  2200 x 1125p    16:9     59.94   148.35       HDTV    BT.1120, SMPTE 274M            -
1920 x 1080p  2200 x 1125p    16:9     60      148.5        HDTV    BT.1120, SMPTE 274M            -
1920 x 1080i  2376 x 1250i    16:9     25      74.25        HDTV    BT.1120                        BT.1120
1920 x 1080p  2376 x 1250p    16:9     50      148.5        HDTV    BT.1120                        -

Table 6.1. Pro-Video Parallel and Serial Digital Interface Standards for Various
Component Video Formats. i = interlaced, p = progressive.
The EAV and SAV sequences are shown in Table 6.2. The status
word is defined as:

F = “0” for Field 1; F = “1” for Field 2
V = “1” during vertical blanking
H = “0” at SAV; H = “1” at EAV

P3-P0 = protection bits

P3 = V ⊕ H
P2 = F ⊕ H
P1 = F ⊕ V
P0 = F ⊕ V ⊕ H

where ⊕ represents the exclusive-OR function. These protection
bits enable 1- and 2-bit errors to be detected and 1-bit errors
to be corrected at the receiver. For most progressive video
systems, F is usually a “0” since there is no field information.
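
As an illustration of the status word definition, the following Python sketch packs F, V, and H with their protection bits using the bit layout of Table 6.2. The function name `status_word` and its `bits` parameter are hypothetical names chosen for this example.

```python
def status_word(f, v, h, bits=10):
    """Build the EAV/SAV status word from the F, V and H flags (each 0 or 1).

    Layout, MSB first: 1 F V H P3 P2 P1 P0, followed by two zero bits
    in the 10-bit case.
    """
    p3 = v ^ h
    p2 = f ^ h
    p1 = f ^ v
    p0 = f ^ v ^ h
    xy = (0x80 | (f << 6) | (v << 5) | (h << 4)
          | (p3 << 3) | (p2 << 2) | (p1 << 1) | p0)
    return xy << 2 if bits == 10 else xy


# 8-bit status words for Field 1: SAV and EAV in active video,
# then SAV and EAV during vertical blanking.
field1_codes = [status_word(0, v, h, bits=8) for v in (0, 1) for h in (0, 1)]
```

For Field 1 this yields 0x80 and 0x9D for SAV and EAV in active video, and 0xAB and 0xB6 during vertical blanking.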

For 4:2:2 YCbCr data, after each SAV sequence, the stream of
active data words always begins with a Cb sample, as shown in
Figure 6.1. In the multiplexed sequence, the co-sited samples
(those that correspond to the same point on the picture) are
grouped as Cb, Y, Cr. During blanking intervals, unless
ancillary data is present, 10-bit Y or R'G'B' values should be
set to 0x040 and 10-bit CbCr values should be set to 0x200.
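
The multiplexing order can be sketched in Python as follows. This is a minimal example, assuming a 10-bit active line of Field 1 (so the EAV and SAV status words are the constants 0x274 and 0x200) and 268 blanking words between the EAV and SAV sequences for 480i (280 for 576i), as indicated in Figure 6.1. The name `build_line` is hypothetical.

```python
EAV_STATUS = 0x274   # 10-bit status word for F=0, V=0, H=1
SAV_STATUS = 0x200   # 10-bit status word for F=0, V=0, H=0
BLANK_Y = 0x040      # blanking level for Y
BLANK_C = 0x200      # blanking level for Cb and Cr


def build_line(y, cb, cr, blanking_words=268):
    """Multiplex one 4:2:2 scan line: EAV, blanking, SAV, then Cb Y Cr Y ..."""
    if not (len(y) == 2 * len(cb) == 2 * len(cr)):
        raise ValueError("4:2:2 needs one Cb and one Cr for every two Y samples")
    line = [0x3FF, 0x000, 0x000, EAV_STATUS]
    # No ancillary data in this sketch: fill blanking with C, Y, C, Y ...
    line += [BLANK_C, BLANK_Y] * (blanking_words // 2)
    line += [0x3FF, 0x000, 0x000, SAV_STATUS]
    for i in range(len(cb)):
        line += [cb[i], y[2 * i], cr[i], y[2 * i + 1]]
    return line


line_480i = build_line([0x040] * 720, [0x200] * 360, [0x200] * 360)
```

With 720 luma samples the result has 1716 words for 480i, or 1728 when `blanking_words=280` is used for 576i.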



The receiver detects the EAV and SAV sequences by looking for
the 8-bit 0xFF 0x00 0x00 preamble. The status word (optionally
error corrected at the receiver, see Table 6.3) is used to
recover the H, V, and F timing signals.
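
A receiver-side scan for the preamble might look like the Python sketch below. It assumes an 8-bit word stream held in a list and does not apply error correction; `find_timing_codes` is a hypothetical name.

```python
def find_timing_codes(stream):
    """Return (index, F, V, H) for each 0xFF 0x00 0x00 XY sequence found."""
    found = []
    i = 0
    while i + 3 < len(stream):
        if stream[i] == 0xFF and stream[i + 1] == 0x00 and stream[i + 2] == 0x00:
            xy = stream[i + 3]
            # In the 8-bit status word, F, V and H are bits 6, 5 and 4.
            found.append((i, (xy >> 6) & 1, (xy >> 5) & 1, (xy >> 4) & 1))
            i += 4
        else:
            i += 1
    return found


# Two blanking words, an EAV, four blanking words, an SAV, two more words.
words = [0x80, 0x10, 0xFF, 0x00, 0x00, 0x9D,
         0x80, 0x10, 0x80, 0x10, 0xFF, 0x00, 0x00, 0x80, 0x80, 0x10]
codes = find_timing_codes(words)
```

Because 0x00 and 0xFF are excluded from video and user data, the three-word preamble cannot occur inside valid payload.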

Ancillary Data 

Ancillary data packets are used to transmit 
non-video information (such as digital audio, 
closed captioning, teletext, etc.) during the 
blanking intervals. A wide variety of ITU-R and 
SMPTE specifications describe the various 
ancillary data formats. 

During horizontal blanking, ancillary data 
may be transmitted in the interval between the 
EAV and SAV sequences. During vertical 
blanking, ancillary data may be transmitted in 
the interval between the SAV and EAV 
sequences. Multiple ancillary packets may be 
present in a horizontal or vertical blanking 
interval, but they must be contiguous with 
each other. 





8-bit data uses D9 (MSB) through D2; 10-bit data uses D9 (MSB) through D0.

               D9   D8   D7   D6   D5   D4   D3   D2   D1   D0
preamble       1    1    1    1    1    1    1    1    1    1
               0    0    0    0    0    0    0    0    0    0
               0    0    0    0    0    0    0    0    0    0
status word    1    F    V    H    P3   P2   P1   P0   0    0

Table 6.2. EAV and SAV Sequence.








Received    Received F, V, H (Bits D8-D6)
D5-D2       000   001   010   011   100   101   110   111
--------    ---   ---   ---   ---   ---   ---   ---   ---
0000        000   000   000   *     000   *     *     111
0001        000   *     *     111   *     111   111   111
0010        000   *     *     011   *     101   *     *
0011        *     *     010   *     100   *     *     111
0100        000   *     *     011   *     *     110   *
0101        *     001   *     *     100   *     *     111
0110        *     011   011   011   100   *     *     011
0111        100   *     *     011   100   100   100   *
1000        000   *     *     *     *     101   110   *
1001        *     001   010   *     *     *     *     111
1010        *     101   010   *     101   101   *     101
1011        010   *     010   010   *     101   010   *
1100        *     001   110   *     110   *     110   110
1101        001   001   *     001   *     001   110   *
1110        *     *     *     011   *     101   110   *
1111        *     001   010   *     100   *     *     *

Notes:

* = uncorrectable error.

Table 6.3. SAV and EAV Error Correction at Decoder.
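
The entries of Table 6.3 can be reproduced with nearest-codeword decoding, sketched here in Python. This is one possible way to derive the table, not necessarily how a given receiver implements it: a received status word is accepted if it differs from a valid codeword in at most one of its seven F, V, H, P3-P0 bits. The names `protection` and `correct_fvh` are hypothetical.

```python
def protection(f, v, h):
    """Return the 4-bit value P3 P2 P1 P0 for the given F, V and H flags."""
    return ((v ^ h) << 3) | ((f ^ h) << 2) | ((f ^ v) << 1) | (f ^ v ^ h)


# The eight valid 7-bit codewords (F V H P3 P2 P1 P0), keyed by the FVH value.
CODEWORDS = {
    fvh: (fvh << 4) | protection((fvh >> 2) & 1, (fvh >> 1) & 1, fvh & 1)
    for fvh in range(8)
}


def correct_fvh(received_fvh, received_p):
    """Return the corrected 3-bit FVH value, or None if uncorrectable."""
    received = (received_fvh << 4) | received_p
    for fvh, codeword in CODEWORDS.items():
        # Accept a codeword that matches exactly or differs in one bit.
        if bin(received ^ codeword).count("1") <= 1:
            return fvh
    return None


# Received FVH = 001 with protection bits 0000 is corrected to 000.
example = correct_fvh(0b001, 0b0000)
```

Any two valid codewords differ in at least four bit positions, so at most one codeword can lie within one bit of a received word; everything else corresponds to a "*" entry in the table.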



[Figure 6.1 diagram labels: BT.601 H signal; start of digital line;
start of digital active line; BT.656 4:2:2 video; 268 (280);
1716 (1728).]

Figure 6.1. BT.656 Parallel Interface Data For One Scan Line. 480i; 4:2:2 YCbCr;
720 active samples per line; 27 MHz clock; 10-bit system. The values for 576i
systems are shown in parentheses.
There are two types of ancillary data formats. The older Type 1
format uses a single data ID word to indicate the type of
ancillary data; the newer Type 2 format uses two words for the
data ID. The general packet format is shown in Table 6.4.

Data ID (DID) 

DID indicates the type of data being sent. 
The assignment of most of the DID values is 
controlled by the ITU and SMPTE to ensure 
equipment compatibility. A few DID values are 
available that don’t require registration. 

Secondary ID (SDID, Type 2 Only) 

SDID is also part of the data ID for Type 2 
ancillary formats. The assignment of most of 
the SDID values is also controlled by the ITU 
and SMPTE to ensure equipment compatibil- 
ity. A few SDID values are available that don’t 
require registration. 

Data Block Number (DBN, Type 1 Only) 

DBN is used to allow multiple ancillary 
packets (sharing the same DID) to be put back 
together at the receiver. This is the case when 
there are more than 255 user data words 
required to be transmitted, thus requiring 
more than one ancillary packet to be used. The 
DBN value increments by one for each consec- 
utive ancillary packet. 

Data Count (DC) 

DC specifies the number of user data words in the packet. In
8-bit applications, it specifies the six MSBs of an 8-bit value,
so the number of user data words must be an integer multiple of
four.



User Data Words (UDW) 

Up to 255 user data words may be present in the packet. In 8-bit
applications, the number of user data words must be an integer
multiple of four. Padding words may be added to ensure that the
packet contains a multiple of four user data words.

User data may not use the 10-bit values of 0x000-0x003 and
0x3FC-0x3FF, or the 8-bit values of 0x00 and 0xFF, since they
are used for timing information.

Parallel Interfaces 

25-pin Parallel Interface 

This interface is used to transfer SDTV 
resolution 4:2:2 YCbCr data. 8-bit or 10-bit data 
and a clock are transferred. The individual bits 
are labeled D0-D9, with D9 being the most sig- 
nificant bit. The pin allocations for the signals 
are shown in Table 6.5. 

Y has a nominal 10-bit range of 0x040-0x3AC. Values less than
0x040 or greater than 0x3AC may be present due to processing.
During blanking, Y data should have a value of 0x040, unless
other information is present.

Cb and Cr have a nominal 10-bit range of 
0x040-0x3C0. Values less than 0x040 or 
greater than 0x3C0 may be present due to pro- 
cessing. During blanking, CbCr data should 
have a value of 0x200, unless other data is 
present. 

Signal levels are compatible with ECL-compatible balanced
drivers and receivers. The generator must have a balanced output
with a maximum source impedance of 110 Ω; the signal must be
0.8-2.0V peak-to-peak measured across a 110-Ω load. At the
receiver, the transmission line is terminated by 110 ±10 Ω.
8-bit data uses D9 (MSB) through D2; 10-bit data uses D9 (MSB) through D0.

                        D9       D8       D7   D6   D5   D4   D3   D2   D1   D0
ancillary data          0        0        0    0    0    0    0    0    0    0
flag (ADF)              1        1        1    1    1    1    1    1    1    1
                        1        1        1    1    1    1    1    1    1    1
data ID (DID)           not D8   even     value of 0000 0000 to 1111 1111
                                 parity
data block number       not D8   even     value of 0000 0000 to 1111 1111
or SDID                          parity
data count (DC)         not D8   even     value of 0000 0000 to 1111 1111
                                 parity
user data word 0        value of 00 0000 0100 to 11 1111 1011
   :
user data word N        value of 00 0000 0100 to 11 1111 1011
checksum                not D8   sum of D0-D8 of data ID through last user data
                                 word. Preset to all zeros; carry is ignored.

Table 6.4. Ancillary Data Packet General Format.
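
The packet layout of Table 6.4 can be sketched in Python as follows. The example assumes 10-bit words, that D8 is even parity over D7-D0, and that D9 is the inverse of D8 in the DID, DBN/SDID, DC, and checksum words. The names `with_parity` and `build_anc_packet` are hypothetical.

```python
def with_parity(value):
    """Turn an 8-bit value into a 10-bit word with D8 parity and D9 = NOT D8."""
    value &= 0xFF
    d8 = bin(value).count("1") & 1      # makes the count of ones in D8-D0 even
    return ((d8 ^ 1) << 9) | (d8 << 8) | value


def build_anc_packet(did, dbn_or_sdid, user_words):
    """Assemble ADF, DID, DBN/SDID, DC, user data words and checksum."""
    if len(user_words) > 255:
        raise ValueError("at most 255 user data words per packet")
    for word in user_words:
        if not 0x004 <= word <= 0x3FB:
            raise ValueError("user data word uses a reserved value")
    body = [with_parity(did), with_parity(dbn_or_sdid),
            with_parity(len(user_words))] + list(user_words)
    # 9-bit sum of D8-D0 from the data ID through the last user data word.
    checksum = sum(word & 0x1FF for word in body) & 0x1FF
    d8 = (checksum >> 8) & 1
    checksum |= (d8 ^ 1) << 9
    return [0x000, 0x3FF, 0x3FF] + body + [checksum]


packet = build_anc_packet(0x41, 0x01, [0x200, 0x104])
```

The DID and DBN values here are arbitrary illustrations, not registered assignments.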
Pin   Signal             Pin   Signal
---   ---------------    ---   ---------------
1     clock              14    clock-
2     system ground A    15    system ground B
3     D9                 16    D9-
4     D8                 17    D8-
5     D7                 18    D7-
6     D6                 19    D6-
7     D5                 20    D5-
8     D4                 21    D4-
9     D3                 22    D3-
10    D2                 23    D2-
11    D1                 24    D1-
12    D0                 25    D0-
13    cable shield

Table 6.5. 25-Pin Parallel Interface Connector Pin
Assignments. For 8-bit interfaces, D9-D2 are used.



27 MHz Parallel Interface 

This BT.656 and SMPTE 125M interface is 
used for 480i and 576i systems with an aspect 
ratio of 4:3. Y and multiplexed CbCr informa- 
tion at a sample rate of 13.5 MHz are multi- 
plexed into a single 8-bit or 10-bit data stream, 
at a clock rate of 27 MHz. 

The 27 MHz clock signal has a clock pulse width of 18.5 ±3 ns.
The positive transition of the clock signal occurs midway
between data transitions with a tolerance of ±3 ns (as shown in
Figure 6.2).

To permit reliable operation at intercon- 
nect lengths of 50-200 meters, the receiver 
must use frequency equalization, with typical 
characteristics shown in Figure 6.3. This 
example enables operation with a range of 
cable lengths down to zero. 



36 MHz Parallel Interface 

This BT.1302 and SMPTE 267M interface 
is used for 480i and 576i systems with an 
aspect ratio of 16:9. Y and multiplexed CbCr 
information at a sample rate of 18 MHz are 
multiplexed into a single 8-bit or 10-bit data 
stream, at a clock rate of 36 MHz. 

The 36 MHz clock signal has a clock pulse width of 13.9 ±2 ns.
The positive transition of the clock signal occurs midway
between data transitions with a tolerance of ±2 ns (as shown in
Figure 6.4).

To permit reliable operation at intercon- 
nect lengths of 40-160 meters, the receiver 
must use frequency equalization, with typical 
characteristics shown in Figure 6.3. 
TW = 18.5 ± 3 NS
TC = 37 NS
TD = 18.5 ± 3 NS



Figure 6.2. 25-Pin 27 MHz Parallel Interface Waveforms. 



RELATIVE GAIN (DB) 




Figure 6.3. Example Line Receiver Equalization 
Characteristics for Small Signals. 
TW = 13.9 ± 2 NS
TC = 27.8 NS
TD = 13.9 ± 2 NS



Figure 6.4. 25-Pin 36 MHz Parallel Interface Waveforms. 



93-pin Parallel Interface 

This interface is used to transfer HDTV resolution R'G'B' data,
4:2:2 YCbCr data, or 4:2:2:4 YCbCrK data. The pin allocations
for the signals are shown in Table 6.6. The most significant
bits are R9, G9, and B9.

When transferring 4:2:2 YCbCr data, the 
green channel carries Y information and the 
red channel carries multiplexed CbCr informa- 
tion. 

When transferring 4:2:2:4 YCbCrK data, 
the green channel carries Y information, the 
red channel carries multiplexed CbCr informa- 
tion, and the blue channel carries K (alpha key- 
ing) information. 

Y has a nominal 10-bit range of 0x040-0x3AC. Values less than
0x040 or greater than 0x3AC may be present due to processing.
During blanking, Y data should have a value of 0x040, unless
other information is present.

Cb and Cr have a nominal 10-bit range of 
0x040-0x3C0. Values less than 0x040 or 
greater than 0x3C0 may be present due to pro- 
cessing. During blanking, CbCr data should 
have a value of 0x200, unless other information 
is present. 



R'G'B' and K have a nominal 10-bit range of 0x040-0x3AC. Values
less than 0x040 or greater than 0x3AC may be present due to
processing. During blanking, R'G'B' data should have a value of
0x040, unless other information is present.

Signal levels are compatible with ECL-compatible balanced
drivers and receivers. The generator must have a balanced output
with a maximum source impedance of 110 Ω; the signal must be
0.6-2.0V peak-to-peak measured across a 110-Ω load. At the
receiver, the transmission line must be terminated by 110 ±10 Ω.

74.25 and 74.176 MHz Parallel Interface

This ITU-R BT.1120 and SMPTE 274M interface is primarily used
for HDTV systems.

The 74.25 or 74.176 MHz (74.25/1.001) clock signal has a clock
pulse width of 6.73 ±1.48 ns. The positive transition of the
clock signal occurs midway between data transitions with a
tolerance of ±1 ns (as shown in Figure 6.5).
To permit reliable operation at interconnect lengths greater
than 20 meters, the receiver must use frequency equalization.

148.5 and 148.35 MHz Parallel Interface 

This BT.1120 and SMPTE 274M interface 
is used for HDTV systems. 

The 148.5 or 148.35 MHz (148.5/1.001) clock signal has a clock
pulse width of 3.37 ±0.74 ns. The positive transition of the
clock signal occurs midway between data transitions with a
tolerance of ±0.5 ns (similar to Figure 6.5).

To permit reliable operation at intercon- 
nect lengths greater than 14 meters, the 
receiver must use frequency equalization. 
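
The nominal pulse widths quoted for the 27, 36, 74.25, and 148.5 MHz interfaces are consistent with half of one clock period, as this short Python check shows. The assumption that the nominal pulse width is exactly half the period (a 50% duty cycle) and the function name `clock_timing_ns` are introduced here for illustration.

```python
def clock_timing_ns(clock_mhz):
    """Return (clock period, nominal pulse width) in nanoseconds."""
    period = 1000.0 / clock_mhz
    return period, period / 2


for mhz in (27.0, 36.0, 74.25, 148.5):
    period, width = clock_timing_ns(mhz)
    print(f"{mhz:7.2f} MHz: period {period:6.2f} ns, pulse width {width:5.2f} ns")
```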

Applications 

One or more parallel interfaces may be 
used to transfer various video formats between 
equipment. 



4:2:2 YCbCr - Interlaced SDTV 

The ITU-R BT.656 and BT.1302 parallel interfaces were developed
to transfer BT.601 4:2:2 YCbCr digital video between equipment.
SMPTE 125M and 267M further clarify the operation for 480i
systems.

Figure 6.6 illustrates the timing for one 
scan line for the 4:3 aspect ratio, using a 27 
MHz sample clock. Figure 6.7 shows the tim- 
ing for one scan line for the 16:9 aspect ratio, 
using a 36 MHz sample clock. The 25-pin paral- 
lel interface is used. 

4:4:4:4 YCbCrK - Interlaced SDTV 

The ITU-R BT.799 and BT.1303 parallel 
interfaces were developed to transfer BT.601 
4:4:4:4 YCbCrK digital video between equip- 
ment. K is an alpha keying signal, used to mix 
two video sources, discussed in Chapter 7. 
SMPTE RP-175 further clarifies the operation 
for 480i systems. 




TW = 6.73 ± 1.48 NS 
TC = 13.47 NS 
TD = 6.73 ± 1 NS 



Figure 6.5. 93-Pin 74.25 and 74.176 MHz Parallel Interface Waveforms. 






[Figure 6.6 diagram labels: BT.601 H signal; start of digital line;
start of digital active line; BT.656 4:2:2 video; 268 (280);
1716 (1728).]

Figure 6.6. BT.656 and SMPTE 125M Parallel Interface Data for One Scan Line. 480i;
4:2:2 YCbCr; 720 active samples per line; 27 MHz clock; 10-bit system. The values
for 576i systems are shown in parentheses.



[Figure 6.7 diagram labels: BT.601 H signal; start of digital line;
EAV code; blanking; SAV code; start of digital active line;
co-sited samples; next line; BT.1302 4:2:2 video; 2288 (2304).]

Figure 6.7. BT.1302 and SMPTE 267M Parallel Interface Data for One Scan Line.
480i; 4:2:2 YCbCr; 960 active samples per line; 36 MHz clock; 10-bit system. The
values for 576i systems are shown in parentheses.
Two transmission links are used. Link A contains all the Y
samples plus those Cb and Cr samples located at even-numbered
sample points. Link B contains samples from the keying channel
and the Cb and Cr samples from the odd-numbered sample points.
Although it may be common to refer to Link A as 4:2:2 and Link B
as 2:2:4, Link A is not a true 4:2:2 signal since the CbCr data
was sampled at 13.5 MHz, rather than 6.75 MHz.

Figure 6.8 shows the contents of links A and B when transmitting
4:4:4:4 YCbCrK video data. Figure 6.9 illustrates the contents
when transmitting R'G'B'K video data. If the keying signal (K)
is not present, the K sample values should have a 10-bit value
of 0x3AC.
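
The division of samples between the two links can be sketched in Python as below. The assignment of samples to links follows the description above; the word order within each link (Cb, Y, Cr, Y on Link A and Cb, K, Cr, K on Link B) is an assumption made for this example, and `split_links` is a hypothetical name.

```python
def split_links(y, cb, cr, k):
    """Split 4:4:4:4 YCbCrK samples (equal-length lists) into Link A and Link B."""
    n = len(y)
    if n % 2 or not (len(cb) == len(cr) == len(k) == n):
        raise ValueError("need an even number of samples in every channel")
    link_a, link_b = [], []
    for i in range(0, n, 2):
        # Link A: every Y sample plus Cb and Cr from even-numbered sample points.
        link_a += [cb[i], y[i], cr[i], y[i + 1]]
        # Link B: every K sample plus Cb and Cr from odd-numbered sample points.
        link_b += [cb[i + 1], k[i], cr[i + 1], k[i + 1]]
    return link_a, link_b


link_a, link_b = split_links(y=[10, 11, 12, 13], cb=[20, 21, 22, 23],
                             cr=[30, 31, 32, 33], k=[40, 41, 42, 43])
```

Each link carries the same number of words per line as a 4:2:2 stream, which is why two ordinary 25-pin interfaces can be used.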

Figure 6.10 illustrates the YCbCrK timing 
for one scan line for the 4:3 aspect ratio, using 
a 27 MHz sample clock. Figure 6.11 shows the 
YCbCrK timing for one scan line for the 16:9 
aspect ratio, using a 36 MHz sample clock. 
Two 25-pin parallel interfaces are used. 

R'G'B'K - Interlaced SDTV

BT.799 and BT.1303 also support transferring BT.601 R'G'B'K
digital video between equipment. For additional information, see
the 4:4:4:4 YCbCrK interface. SMPTE RP-175 further clarifies the
operation for 480i systems. The G' samples are sent in the Y
locations, the R' samples are sent in the Cr locations, and the
B' samples are sent in the Cb locations.

4:2:2 YCbCr - Progressive SDTV 

ITU-R BT.1362 defines two 10-bit 4:2:2 YCbCr data streams
(Figure 6.12), using a 27 MHz sample clock. SMPTE 294M further
clarifies the operation for 480p systems. Table 6.7 shows which
stream carries each scan line.



4:2:2 YCbCr - Interlaced HDTV 

The ITU-R BT.1120 parallel interface was 
developed to transfer interlaced HDTV 4:2:2 
YCbCr digital video between equipment. 
SMPTE 274M further clarifies the operation 
for 29.97 and 30 Hz systems. 

Figure 6.13 illustrates the timing for one 
scan line for the 1920 x 1080i active resolu- 
tions. The 93-pin parallel interface is used with 
a sample clock rate of 74.25 MHz (25 or 30 Hz 
frame rate) or 74.176 MHz (29.97 Hz frame 
rate). 

4:2:2:4 YCbCrK - Interlaced HDTV 

BT.1120 also supports transferring HDTV 
4:2:2:4 YCbCrK digital video between equip- 
ment. SMPTE 274M further clarifies the oper- 
ation for 29.97 and 30 Hz systems. 

Figure 6.14 illustrates the timing for one scan line for the
1920 x 1080i active resolutions. The 93-pin parallel interface
is used with a sample clock rate of 74.25 MHz (25 or 30 Hz frame
rate) or 74.176 MHz (29.97 Hz frame rate).

R'G'B' - Interlaced HDTV

BT.1120 also supports transferring HDTV R'G'B' digital video
between equipment. SMPTE 274M further clarifies the operation
for 29.97 and 30 Hz systems.

Figure 6.15 illustrates the timing for one scan line for the
1920 x 1080i active resolutions. The 93-pin parallel interface
is used with a sample clock rate of 74.25 MHz (25 or 30 Hz frame
rate) or 74.176 MHz (29.97 Hz frame rate).
Figure 6.8. Link Content Representation for
YCbCrK Video Signals.

Figure 6.9. Link Content Representation for
R'G'B'K Video Signals.

Figure 6.10. BT.799 and SMPTE RP-175 Parallel Interface Data for One Scan Line. 480i; 4:4:4:4
YCbCrK; 720 active samples per line; 27 MHz clock; 10-bit system. The values for 576i systems
are shown in parentheses.

Figure 6.11. BT.1303 Parallel Interface Data for One Scan Line. 480i; 4:4:4:4 YCbCrK; 960 active
samples per line; 36 MHz clock; 10-bit system. The values for 576i systems are shown in
parentheses.

Figure 6.12. BT.1362 and SMPTE 294M Parallel Data for Two Scan Lines. 480p; 4:2:2 YCbCr; 720
active samples per line; 27 MHz clock; 10-bit system. The values for 576p systems are shown in
parentheses.

Table 6.7. BT.1362 and SMPTE 294M Scan Line Numbering
and Link Assignment.

Figure 6.13. BT.1120 and SMPTE 274M Parallel Interface Data for One Scan Line.
1080i29.97, 1080i30, 1080p59.94, and 1080p60 systems; 4:2:2 YCbCr; 1920 active
samples per line; 74.176, 74.25, 148.35, or 148.5 MHz clock; 10-bit system. The
values for 1080i25 and 1080p50 systems are shown in parentheses.

Figure 6.14. BT.1120 and SMPTE 274M Parallel Interface Data for One Scan Line.
1080i29.97, 1080i30, 1080p59.94, and 1080p60 systems; 4:2:2:4 YCbCrK; 1920
active samples per line; 74.176, 74.25, 148.35, or 148.5 MHz clock; 10-bit system.
The values for 1080i25 and 1080p50 systems are shown in parentheses.

Figure 6.15. BT.1120 and SMPTE 274M Parallel Interface Data for One Scan Line.
1080i29.97, 1080i30, 1080p59.94, and 1080p60 systems; R'G'B'; 1920 active
samples per line; 74.176, 74.25, 148.35, or 148.5 MHz clock; 10-bit system. The
values for 1080i25 and 1080p50 systems are shown in parentheses.
4:2:2 YCbCr - Progressive HDTV

The ITU-R BT.1120 and SMPTE 274M par- 
allel interfaces were developed to transfer pro- 
gressive HDTV 4:2:2 YCbCr digital video 
between equipment. 

Figure 6.13 illustrates the timing for one scan line for the
1920 x 1080p active resolutions. The 93-pin parallel interface
is used with a sample clock rate of 148.5 MHz (24, 25, 30, 50,
or 60 Hz frame rate) or 148.35 MHz (23.98, 29.97, or 59.94 Hz
frame rate).

Figure 6.16 illustrates the timing for one scan line for the
1280 x 720p active resolutions. The 93-pin parallel interface is
used with a sample clock rate of 74.25 MHz (24, 25, 30, 50, or
60 Hz frame rate) or 74.176 MHz (23.98, 29.97, or 59.94 Hz frame
rate).

4:2:2:4 YCbCrK - Progressive HDTV 

BT.1120 and SMPTE 274M also support 
transferring HDTV 4:2:2:4 YCbCrK digital 
video between equipment. 

Figure 6.14 illustrates the timing for one scan line for the
1920 x 1080p active resolutions. The 93-pin parallel interface
is used with a sample clock rate of 148.5 MHz (24, 25, 30, 50,
or 60 Hz frame rate) or 148.35 MHz (23.98, 29.97, or 59.94 Hz
frame rate).

Figure 6.17 illustrates the timing for one scan line for the
1280 x 720p active resolutions. The 93-pin parallel interface is
used with a sample clock rate of 74.25 MHz (24, 25, 30, 50, or
60 Hz frame rate) or 74.176 MHz (23.98, 29.97, or 59.94 Hz frame
rate).

R'G'B' - Progressive HDTV

BT.1120 and SMPTE 274M also support transferring HDTV R'G'B'
digital video between equipment.

Figure 6.15 illustrates the timing for one scan line for the
1920 x 1080p active resolutions. The 93-pin parallel interface
is used with a sample clock rate of 148.5 MHz (24, 25, 30, 50,
or 60 Hz frame rate) or 148.35 MHz (23.98, 29.97, or 59.94 Hz
frame rate).

Figure 6.18 illustrates the timing for one scan line for the
1280 x 720p active resolutions. The 93-pin parallel interface is
used with a sample clock rate of 74.25 MHz (24, 25, 30, 50, or
60 Hz frame rate) or 74.176 MHz (23.98, 29.97, or 59.94 Hz frame
rate).

Serial Interfaces 

The parallel formats can be converted to a serial format (Figure
6.19), allowing data to be transmitted using a 75-Ω coaxial
cable or optical fiber.

For cable interconnect, the generator has an unbalanced output
with a source impedance of 75 Ω; the signal must be 0.8V ±10%
peak-to-peak measured across a 75-Ω load. The receiver has an
input impedance of 75 Ω.

In an 8-bit environment, before serialization, the 0x00 and 0xFF
codes during EAV and SAV are expanded to 10-bit values of 0x000
and 0x3FF, respectively. All other 8-bit data is appended with
two least significant “0” bits before serialization.
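
The 8-bit to 10-bit expansion amounts to the following Python sketch. Since 0x00 and 0xFF are reserved for timing codes, the example treats every occurrence of them as part of an EAV or SAV sequence; `expand_8_to_10` is a hypothetical name.

```python
def expand_8_to_10(word):
    """Expand one 8-bit interface word to its 10-bit equivalent."""
    if word == 0x00:
        return 0x000
    if word == 0xFF:
        return 0x3FF
    return word << 2          # append two least significant zero bits


# An 8-bit EAV (active video, Field 1) followed by blanking Cb and Y values.
expanded = [expand_8_to_10(w) for w in (0xFF, 0x00, 0x00, 0x9D, 0x80, 0x10)]
```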

The 10 bits of data are serialized (LSB first) and processed
using a scrambled and polarity-free NRZI algorithm:

G(x) = (x^9 + x^4 + 1)(x + 1)

The input signal to the scrambler (Figure 6.20) uses positive
logic (the highest voltage represents a logical one; lowest
voltage represents a logical zero).
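
The two factors of G(x) can be illustrated with the Python sketch below: a self-synchronizing scrambler for x^9 + x^4 + 1 followed by an NRZI stage for x + 1. The tap positions are an assumption: each power of x is read here as a delay of that many bits, whereas a hardware circuit may be drawn with the mirrored tap arrangement, so the applicable standard should be consulted for an interoperable design. All function names are hypothetical.

```python
def word_to_bits(word):
    """Serialize a 10-bit word, least significant bit first."""
    return [(word >> i) & 1 for i in range(10)]


def scramble_nrzi(bits, taps=(4, 9)):
    """Scramble the bits, then NRZI-encode them (a one toggles the line level)."""
    history = [0] * max(taps)      # history[k-1] is the scrambled bit k bits ago
    level, out = 0, []
    for bit in bits:
        scrambled = bit
        for tap in taps:
            scrambled ^= history[tap - 1]
        history = [scrambled] + history[:-1]
        level ^= scrambled
        out.append(level)
    return out


def descramble_nrzi(levels, taps=(4, 9)):
    """Invert scramble_nrzi: detect transitions, then undo the scrambler."""
    history = [0] * max(taps)
    previous, out = 0, []
    for level in levels:
        scrambled = level ^ previous   # a transition decodes as a one
        previous = level
        bit = scrambled
        for tap in taps:
            bit ^= history[tap - 1]
        history = [scrambled] + history[:-1]
        out.append(bit)
    return out


data = [b for w in (0x3FF, 0x000, 0x000, 0x200, 0x200, 0x040) for b in word_to_bits(w)]
line = scramble_nrzi(data)
recovered = descramble_nrzi(line)
recovered_inverted = descramble_nrzi([lv ^ 1 for lv in line])
```

In this sketch `recovered` equals `data`, and decoding the inverted line gives the same result once the first ten bits have passed through the descrambler, which is the sense in which the coding is polarity-free.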

The formatted serial data is output at the 
10x sample clock rate. Since the parallel clock 
may contain large amounts of jitter, deriving 
the 10x sample clock directly from an unfil- 
tered parallel clock may result in excessive sig- 
nal jitter. 
Figure 6.16. SMPTE 274M Parallel Interface Data for One Scan Line. 720p59.94 and
720p60 systems; 4:2:2 YCbCr; 1280 active samples per line; 74.176 or 74.25 MHz
clock; 10-bit system. The values for 720p50 systems are shown in parentheses.

Figure 6.17. SMPTE 274M Parallel Interface Data for One Scan Line. 720p59.94 and
720p60 systems; 4:2:2:4 YCbCrK; 1280 active samples per line; 74.176 or 74.25 MHz
clock; 10-bit system. The values for 720p50 systems are shown in parentheses.

Figure 6.18. SMPTE 274M Parallel Interface Data for One Scan Line. 720p59.94 and
720p60 systems; R'G'B'; 1280 active samples per line; 74.176 or 74.25 MHz clock;
10-bit system. The values for 720p50 systems are shown in parentheses.

Figure 6.19. Serial Interface Block Diagram.

Figure 6.20. Typical Scrambler Circuit.

[Figure 6.21 diagram labels: encoded data in (NRZI); serial data out (NRZ).]

Figure 6.21. Typical Descrambler Circuit.
At the receiver, phase-lock synchronization
is done by detecting the EAV and SAV 
sequences. The PLL is continuously adjusted 
slightly each scan line to ensure that these pat- 
terns are detected and to avoid bit slippage. 
The recovered 10x sample clock is divided by 
ten to generate the sample clock, although 
care must be taken not to mask word-related 
jitter components. The serial data is low- and 
high-frequency equalized, inverse scrambling 
performed (Figure 6.21), and deserialized. 

270 Mbps Serial Interface 

This BT.656 and SMPTE 259M interface 
(also called SDI) converts a 27 MHz parallel 
stream into a 270 Mbps serial stream. The 10x 
PLL generates a 270 MHz clock from the 27 
MHz clock signal. This interface is primarily 
used for 480i and 576i 4:3 systems. 

360 Mbps Serial Interface 

This BT.1302 and SMPTE 259M interface converts a 36 MHz parallel
stream into a 360 Mbps serial stream. The 10x PLL generates a
360 MHz clock from the 36 MHz clock signal. This interface is
primarily used for 480i and 576i 16:9 systems.

540 Mbps Serial Interface 

This SMPTE 344M interface converts a 54 
MHz parallel stream, or two 27 MHz parallel 
streams, into a 540 Mbps serial stream. The 
10x PLL generates a 540 MHz clock from the 
54 MHz clock signal. This interface is prima- 
rily used for 480p and 576p 4:3 systems. 

1.485 and 1.4835 Gbps Serial Interface 

This BT.1120 and SMPTE 292M interface 
multiplexes two 74.25 or 74.176 (74.25/1.001) 
MHz parallel streams (Y and CbCr) into a sin- 
gle 1.485 or 1.4835 Gbps serial stream. A 20x 
PLL generates a 1.485 GHz clock from the 
74.25 or 74.176 MHz clock signal. This inter- 
face is used for HDTV systems. 
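
The serial rates follow directly from the parallel clock and the PLL multiplier, as this Python sketch shows. The dictionary layout and the name `serial_rate_mbps` are assumptions made for the example.

```python
# Interface name -> (parallel clock in MHz, PLL multiplier)
INTERFACES = {
    "270 Mbps": (27.0, 10),
    "360 Mbps": (36.0, 10),
    "540 Mbps": (54.0, 10),
    "1.485 Gbps": (74.25, 20),             # Y and CbCr streams multiplexed
    "1.4835 Gbps": (74.25 / 1.001, 20),
}


def serial_rate_mbps(clock_mhz, multiplier):
    """Serial bit rate in Mbps for a given parallel clock and PLL multiplier."""
    return clock_mhz * multiplier


rates = {name: serial_rate_mbps(*params) for name, params in INTERFACES.items()}
```

The HDTV interface uses a 20x multiplier because two 10-bit parallel streams share one serial link.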

Before multiplexing the two parallel 
streams together, line number and CRC infor- 
mation (Table 6.8) is added to each stream 
after each EAV sequence. The CRC is used to 





        D9       D8      D7      D6      D5      D4      D3      D2      D1      D0
        (MSB)
LN0     not D8   L6      L5      L4      L3      L2      L1      L0      0       0
LN1     not D8   0       0       0       L10     L9      L8      L7      0       0
CRC0    not D8   crc8    crc7    crc6    crc5    crc4    crc3    crc2    crc1    crc0
CRC1    not D8   crc17   crc16   crc15   crc14   crc13   crc12   crc11   crc10   crc9

Table 6.8. Line Number and CRC Data.
detect errors in the active video and EAV. It consists of two
words generated by the polynomial:

CRC = x^18 + x^5 + x^4 + 1

The initial value is set to zero. The calculation starts with
the first active line word and ends at the last word of the line
number (LN1).
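
A Python sketch of the line number packing of Table 6.8 and of the CRC calculation is given below. The packing follows the table, with D9 taken as the inverse of D8. The CRC routine assumes that words are processed least significant bit first through a right-shifting register, so the polynomial appears in bit-reversed form (0x23000); this ordering is an assumption of the example and should be checked against the standard before relying on it. All names are hypothetical.

```python
CRC_POLY_REFLECTED = 0x23000    # x^18 + x^5 + x^4 + 1, bit-reversed


def _with_not_d8(low9):
    """Place nine bits in D8-D0 and set D9 to the inverse of D8."""
    d8 = (low9 >> 8) & 1
    return ((d8 ^ 1) << 9) | (low9 & 0x1FF)


def pack_line_number(line):
    """Return (LN0, LN1) for an 11-bit line number L10-L0."""
    ln0 = _with_not_d8((line & 0x7F) << 2)          # L6-L0 in D8-D2
    ln1 = _with_not_d8(((line >> 7) & 0x0F) << 2)   # L10-L7 in D5-D2
    return ln0, ln1


def crc18(words, crc=0):
    """Bitwise 18-bit CRC over 10-bit words, initial value zero."""
    for word in words:
        for i in range(10):
            feedback = ((word >> i) & 1) ^ (crc & 1)
            crc >>= 1
            if feedback:
                crc ^= CRC_POLY_REFLECTED
    return crc


def pack_crc(crc):
    """Return (CRC0, CRC1) carrying crc8-crc0 and crc17-crc9."""
    return _with_not_d8(crc & 0x1FF), _with_not_d8((crc >> 9) & 0x1FF)


ln0, ln1 = pack_line_number(1125)
```

For line 1125 the packing gives LN0 = 0x194 and LN1 = 0x220.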

Applications 

One or more serial interfaces may be used 
to transfer various video formats between 
equipment. 

R'G'B'K - Interlaced SDTV

BT.799 and BT.1303 also define an R'G'B'K serial interface. The
two 10-bit R'G'B'K parallel streams in Figure 6.10 are
serialized using two 270 or 360 Mbps serial interfaces.

4:2:2 YCbCr - Progressive SDTV 

ITU-R BT.1362 and SMPTE 294M also define a 4:2:2 YCbCr serial
interface. The two 10-bit 4:2:2 YCbCr parallel streams in Figure
6.12 are serialized using two 270 Mbps serial interfaces.

4:2:2 YCbCr - Interlaced HDTV 

BT.1120 and SMPTE 292M also define a 4:2:2 YCbCr serial
interface. The two 10-bit 4:2:2 YCbCr parallel streams shown in
Figure 6.13 are multiplexed together, then serialized using a
1.485 or 1.4835 Gbps serial interface.

Pro-Video Composite 
Interfaces 

Digital composite video is essentially a digital version of a
composite analog (M) NTSC or (B, D, G, H, I) PAL video signal.
The sample clock rate is four times Fsc: about 14.32 MHz for (M)
NTSC and about 17.73 MHz for (B, D, G, H, I) PAL.

Usually, both 8-bit and 10-bit interfaces are 
supported, with the 10-bit interface used to 
transmit 2 bits of fractional video data to mini- 
mize cumulative processing errors and to sup- 
port 10-bit ancillary data. 

Table 6.9 lists the digital composite levels. Video data may not
use the 10-bit values of 0x000-0x003 and 0x3FC-0x3FF, or the
8-bit values of 0x00 and 0xFF, since they are used for timing
information.

NTSC Video Timing 

There are 910 total samples per scan line, 
as shown in Figure 6.22. Horizontal count 0 
corresponds to the start of active video, and a 
horizontal count of 768 corresponds to the 
start of horizontal blanking. 

Sampling is along the +I and +Q axes (33°, 123°, 213°, and
303°). The sampling phase at horizontal count 0 of line 10,
Field 1 is on the +I axis (123°).

The sync edge values, and the horizontal counts at which they
occur, are defined as shown in Figure 6.23 and Tables 6.10-6.12.
8-bit values for one color burst cycle are 45, 83, 75, and 37.
The burst envelope starts at horizontal count 857, and lasts for
43 clock cycles, as shown in Table 6.10. Note that the peak
amplitudes of the burst are not sampled.

To maintain zero SCH phase, horizontal 
count 784 occurs 25.6 ns (33° of the subcarrier 
phase) before the 50% point of the falling edge 
of horizontal sync, and horizontal count 785 
occurs 44.2 ns (57° of the subcarrier phase) 
after the 50% point of the falling edge of hori- 
zontal sync. 
PAL Video Timing

There are 1135 total samples per line, 
except for two lines per frame which have 1137 
samples per line, making a total of 709,379 
samples per frame. Figure 6.24 illustrates the 
typical line timing. Horizontal count 0 corre- 
sponds to the start of active video, and a hori- 
zontal count of 948 corresponds to the start of 
horizontal blanking. 
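
The sample counts for both systems can be checked from the 4x Fsc sample rate with the Python sketch below. The subcarrier and line or frame rate constants are standard values supplied for this example rather than taken from the text: 315/88 MHz and 4.5 MHz/286 for (M) NTSC, and 4.43361875 MHz and 25 Hz for (B, D, G, H, I) PAL.

```python
from fractions import Fraction

NTSC_FSC = Fraction(315_000_000, 88)       # Hz, about 3.579545 MHz
NTSC_LINE_RATE = Fraction(4_500_000, 286)  # Hz, about 15734.27 Hz
PAL_FSC = Fraction(443_361_875, 100)       # Hz, 4.43361875 MHz
PAL_FRAME_RATE = 25                        # Hz

ntsc_sample_rate = 4 * NTSC_FSC
pal_sample_rate = 4 * PAL_FSC

ntsc_samples_per_line = ntsc_sample_rate / NTSC_LINE_RATE
pal_samples_per_frame = pal_sample_rate / PAL_FRAME_RATE

# 623 lines of 1135 samples plus two lines of 1137 samples per PAL frame.
pal_frame_from_lines = 623 * 1135 + 2 * 1137
```

Exact fractions are used so that the results come out as whole numbers: 910 samples per NTSC line and 709,379 samples per PAL frame.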

Sampling is along the +U and +V axes (0°, 
90°, 180°, and 270°), with the sampling phase 
at horizontal count 0 of line 1, Field 1 on the +V 
axis (90°). 

8-bit color burst values are 95, 64, 32, and 
64, continuously repeated. The swinging burst 
causes the peak burst (32 and 95) and zero 
burst (64) samples to change places. The burst 
envelope starts at horizontal count 1058, and 
lasts for 40 clock cycles. 

Sampling is not H-coherent as with (M) 
NTSC, so the position of the sync pulses 
changes from line to line. Zero SCH phase is 
defined when alternate burst samples have a 
value of 64. 



Ancillary Data 

Ancillary data packets are used to transmit 
information (such as digital audio, closed cap- 
tioning, and teletext data) during the blanking 
intervals. ITU-R BT.1364 and SMPTE 291M 
describe the ancillary data formats. 

The ancillary data formats are the same as 
for digital component video, discussed earlier 
in this chapter. However, instead of a 3-word 
preamble, a one-word ancillary data flag is 
used, with a 10-bit value of 0x3FC. There may 
be multiple ancillary data flags following the 
TRS-ID, with each flag identifying the beginning 
of another ancillary packet. 

Ancillary data may be present within the 
following word number boundaries (see Figures 
6.25 through 6.30). 

NTSC                PAL                 
795-849             972-1035            horizontal sync period
795-815, 340-360    972-994, 404-426    equalizing pulse periods
795-260, 340-715    972-302, 404-869    vertical sync periods





Video Level    (M) NTSC    (B, D, G, H, I) PAL
peak chroma    972         1040 (limited to 1023)
white          800         844
peak burst     352         380
black          280         256
blank          240         256
peak burst     128         128
peak chroma    104         128
sync           16          4

Table 6.9. 10-Bit Video Levels for Digital Composite Video Signals. 
Figure 6.22. Digital Composite (M) NTSC Analog and Digital Timing Relationship. 

Figure 6.23. Digital Composite (M) NTSC Sync Timing. The horizontal counts are shown 
with the corresponding 8-bit sample values in parentheses. 
           8-bit Hex Value              10-bit Hex Value
Sample     Fields 1, 3   Fields 2, 4    Fields 1, 3   Fields 2, 4
768-782    3C            3C             0F0           0F0
783        3A            3A             0E9           0E9
784        29            29             0A4           0A4
785        11            11             044           044
786        04            04             011           011
787-849    04            04             010           010
850        06            06             017           017
851        17            17             05C           05C
852        2F            2F             0BC           0BC
853        3C            3C             0EF           0EF
854-856    3C            3C             0F0           0F0
857        3C            3C             0F0           0F0
858        3D            3B             0F4           0EC
859        37            41             0DC           104
860        36            42             0D6           10A
861        4B            2D             12C           0B4
862        49            2F             123           0BD
863        25            53             096           14A
864        2D            4B             0B3           12D
865        53            25             14E           092
866        4B            2D             12D           0B3
867        25            53             092           14E
868        2D            4B             0B3           12D
869        53            25             14E           092
870        4B            2D             12D           0B3
871        25            53             092           14E
872        2D            4B             0B3           12D
873        53            25             14E           092

Table 6.10a. Digital Values During the Horizontal Blanking Intervals for Digital 
Composite (M) NTSC Video Signals. 
           8-bit Hex Value              10-bit Hex Value
Sample     Fields 1, 3   Fields 2, 4    Fields 1, 3   Fields 2, 4
874        4B            2D             12D           0B3
875        25            53             092           14E
876        2D            4B             0B3           12D
877        53            25             14E           092
878        4B            2D             12D           0B3
879        25            53             092           14E
880        2D            4B             0B3           12D
881        53            25             14E           092
882        4B            2D             12D           0B3
883        25            53             092           14E
884        2D            4B             0B3           12D
885        53            25             14E           092
886        4B            2D             12D           0B3
887        25            53             092           14E
888        2D            4B             0B3           12D
889        53            25             14E           092
890        4B            2D             12D           0B3
891        25            53             092           14E
892        2D            4B             0B3           12D
893        53            25             14E           092
894        4A            2E             129           0B7
895        2A            4E             0A6           13A
896        33            45             0CD           113
897        44            34             112           0CE
898        3F            39             0FA           0E6
899        3B            3D             0EC           0F4
900-909    3C            3C             0F0           0F0

Table 6.10b. Digital Values During the Horizontal Blanking Intervals for Digital 
Composite (M) NTSC Video Signals. 
Fields 1, 3                            Fields 2, 4
Sample     8-bit       10-bit          Sample     8-bit       10-bit
           Hex Value   Hex Value                  Hex Value   Hex Value
768-782    3C          0F0             313-327    3C          0F0
783        3A          0E9             328        3A          0E9
784        29          0A4             329        29          0A4
785        11          044             330        11          044
786        04          011             331        04          011
787-815    04          010             332-360    04          010
816        06          017             361        06          017
817        17          05C             362        17          05C
818        2F          0BC             363        2F          0BC
819        3C          0EF             364        3C          0EF
820-327    3C          0F0             365-782    3C          0F0
328        3A          0E9             783        3A          0E9
329        29          0A4             784        29          0A4
330        11          044             785        11          044
331        04          011             786        04          011
332-360    04          010             787-815    04          010
361        06          017             816        06          017
362        17          05C             817        17          05C
363        2F          0BC             818        2F          0BC
364        3C          0EF             819        3C          0EF
365-782    3C          0F0             820-327    3C          0F0

Table 6.11. Equalizing Pulse Values During the Vertical Blanking Intervals for Digital 
Composite (M) NTSC Video Signals. 
Fields 1, 3                            Fields 2, 4
Sample     8-bit       10-bit          Sample     8-bit       10-bit
           Hex Value   Hex Value                  Hex Value   Hex Value
782        3C          0F0             327        3C          0F0
783        3A          0E9             328        3A          0E9
784        29          0A4             329        29          0A4
785        11          044             330        11          044
786        04          011             331        04          011
787-260    04          010             332-715    04          010
261        06          017             716        06          017
262        17          05C             717        17          05C
263        2F          0BC             718        2F          0BC
264        3C          0EF             719        3C          0EF
265-327    3C          0F0             720-782    3C          0F0
328        3A          0E9             783        3A          0E9
329        29          0A4             784        29          0A4
330        11          044             785        11          044
331        04          011             786        04          011
332-715    04          010             787-260    04          010
716        06          017             261        06          017
717        17          05C             262        17          05C
718        2F          0BC             263        2F          0BC
719        3C          0EF             264        3C          0EF
720-782    3C          0F0             265-327    3C          0F0

Table 6.12. Serration Pulse Values During the Vertical Blanking Intervals for Digital 
Composite (M) NTSC Video Signals. 
User data may not use the 10-bit values of 
0x000-0x003 and 0x3FC-0x3FF, or the 8-bit 
values of 0x00 and 0xFF, since they are used 
for timing information. 

Parallel Interface 

The SMPTE 244M 25-pin parallel interface 
is based on that used for 27 MHz 4:2:2 digital 
component video (Table 6.5), except for the 
timing differences. This interface is used to 
transfer SDTV resolution digital composite 
data. 8-bit or 10-bit data and a 4x Fsc clock are 
transferred. 

Signal levels are compatible with ECL-compatible 
balanced drivers and receivers. 
The generator must have a balanced output 
with a maximum source impedance of 110 Ω; 
the signal must be 0.8-2.0V peak-to-peak measured 
across a 110-Ω load. At the receiver, the 
transmission line must be terminated by 110 ±10 Ω. 

The clock signal is a 4x Fsc square wave, 
with a clock pulse width of 35 ±5 ns for (M) 
NTSC or 28 ±5 ns for (B, D, G, H, I) PAL. The 
positive transition of the clock signal occurs 
midway between data transitions with a tolerance 
of ±5 ns (as shown in Figure 6.31). 

To permit reliable operation at intercon- 
nect lengths of 50-200 meters, the receiver 
must use frequency equalization, with typical 
characteristics shown in Figure 6.3. This 
example enables operation with a range of 
cable lengths down to zero. 





[Figure 6.24 timing: digital blanking, 187 samples (948-1134); digital active line, 948 samples (0-947); total line, 1135 samples (0-1134).] 

Figure 6.24. Digital Composite (B, D, G, H, I) PAL Analog and Digital Timing Relationship. 
Serial Interface 

The parallel format can be converted to a 
SMPTE 259M serial format (Figure 6.32), 
allowing data to be transmitted using a 75-Ω 
coaxial cable (or optical fiber). This interface 
converts the 14.32 or 17.73 MHz parallel 
stream into a 143 or 177 Mbps serial stream. 
The 10x PLL generates the 143 or 177 MHz 
clock from the 14.32 or 17.73 MHz clock signal. 
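
The serial rates are ten bits per sample at the 4x Fsc sample rate, or 40x Fsc. A quick check, assuming the standard subcarrier frequencies (315/88 MHz for NTSC and 4.43361875 MHz for PAL, which this section does not state):

```python
NTSC_FSC_HZ = 315e6 / 88       # about 3.579545 MHz, assumed
PAL_FSC_HZ = 4_433_618.75      # assumed


def serial_rate_mbps(fsc_hz: float) -> float:
    """10 bits per sample at a 4x Fsc sample rate, in Mbps."""
    return 10 * 4 * fsc_hz / 1e6


ntsc_mbps = serial_rate_mbps(NTSC_FSC_HZ)
pal_mbps = serial_rate_mbps(PAL_FSC_HZ)
```

Rounded to whole megabits, these give the 143 and 177 Mbps figures quoted above.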

For cable interconnect, the generator has 
an unbalanced output with a source impedance 
of 75 Ω; the signal must be 0.8V ±10% peak-to-peak 
measured across a 75-Ω load. The 
receiver has an input impedance of 75 Ω. 

The 10 bits of data are serialized (LSB 
first) and processed using a scrambled and 
polarity-free NRZI algorithm: 

G(x) = (x^9 + x^4 + 1)(x + 1) 

This algorithm is the same as used for digital 
component video discussed earlier. In an 8-bit 
environment, 8-bit data is appended with two 
least significant “0” bits before serialization. 
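
The two factors of G(x) can be read as a self-synchronizing scrambler (x^9 + x^4 + 1) followed by an NRZI encoder (x + 1). The following is a minimal software sketch under that reading, with all-zero initial register states chosen for illustration; real hardware does not rely on a known starting state, and the function names are hypothetical.

```python
def serialize(words, bits_in=10):
    """Return the bits of each 10-bit word, LSB first.
    8-bit samples are first padded with two least significant zeros."""
    shift = 10 - bits_in
    return [((w << shift) >> i) & 1 for w in words for i in range(10)]


def scramble_nrzi(bits):
    """Scramble with x^9 + x^4 + 1, then NRZI-encode with x + 1."""
    history = [0] * 9            # last nine scrambled bits, newest first
    level = 0                    # current line level
    out = []
    for b in bits:
        s = b ^ history[3] ^ history[8]   # taps at delays 4 and 9
        history = [s] + history[:-1]
        level ^= s                        # a 1 toggles the line level
        out.append(level)
    return out


def descramble_nrzi(levels):
    """Inverse: transitions become 1s, then undo the scrambler."""
    history = [0] * 9
    prev = 0
    out = []
    for lv in levels:
        s = lv ^ prev                     # transition detector
        prev = lv
        out.append(s ^ history[3] ^ history[8])
        history = [s] + history[:-1]
    return out
```

Because the decoder looks only at transitions, inverting the polarity of the received signal disturbs at most the first few decoded bits, which is what "polarity-free" means here.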

The input signal to the scrambler (Figure 
6.20) uses positive logic (the highest voltage 
represents a logical one; lowest voltage represents 
a logical zero). The formatted serial data 
is output at the 40x Fsc rate. 

At the receiver, phase-lock synchronization 
is done by detecting the TRS-ID sequences. 
The PLL is continuously adjusted slightly each 
scan line to ensure that these patterns are 
detected and to avoid bit slippage. The recovered 
10x clock is divided by ten to generate the 
4x Fsc sample clock. The serial data is low- 
and high-frequency equalized, inverse scrambling 
is performed (Figure 6.21), and the data is deserialized. 




[Figure 6.25: TRS-ID at samples 790-794; optional ancillary data at samples 795-849.] 

Figure 6.25. (M) NTSC TRS-ID and Ancillary Data Locations During Horizontal Sync Intervals. 

Figure 6.26. (M) NTSC TRS-ID and Ancillary Data Locations During Vertical Sync Intervals. 

Figure 6.27. (M) NTSC TRS-ID and Ancillary Data Locations During Equalizing Pulse Intervals. 

[Figure 6.28: TRS-ID at samples 967-971; optional ancillary data at samples 972-1035.] 

Figure 6.28. (B, D, G, H, I) PAL TRS-ID and Ancillary Data Locations During Horizontal 
Sync Intervals. 

Figure 6.29. (B, D, G, H, I) PAL TRS-ID and Ancillary Data Locations During Vertical Sync Intervals. 

Figure 6.30. (B, D, G, H, I) PAL TRS-ID and Ancillary Data Locations During Equalizing 
Pulse Intervals. 



TRS-ID 

When using the serial interface, a special 
five-word sequence, known as the TRS-ID, 
must be inserted into the digital video stream 
during the horizontal sync time. The TRS-ID is 
present only following sync leading edges 
which identify a horizontal transition, and 
occupies horizontal counts 790-794, inclusive 
(NTSC) or 967-971, inclusive (PAL). Table 
6.13 shows the TRS-ID format; Figures 6.25 
through 6.30 show the TRS-ID locations for 
digital composite (M) NTSC and (B, D, G, H, 
I) PAL video signals. 

The line number ID word at horizontal 
count 794 (NTSC) or 971 (PAL) is defined as 
shown in Table 6.14. 



PAL requires the reset of the TRS-ID posi- 
tion relative to horizontal sync once per field 
on only one of lines 625-4 and 313-317 due to 
the 25 Hz offset. All lines have 1135 samples 
except the two lines used for reset, which have 
1137 samples. The two additional samples are 
numbered 1135 and 1136, and occur just prior 
to the first active picture sample (sample 0) . 

Due to the 25 Hz offset, the samples occur 
slightly earlier each line. Initial determination 
of the TRS-ID position should be done on line 
1, Field 1, or a nearby line. The TRS-ID loca- 
tion always starts at sample 967, but the dis- 
tance from the leading edge of sync varies due 
to the 25 Hz offset. 
[Figure 6.31 timing parameters:] 
TW = 35 ±5 ns (M) NTSC; 28 ±5 ns (B, D, G, H, I) PAL 
TC = 69.84 ns (M) NTSC; 56.39 ns (B, D, G, H, I) PAL 
TD = 35 ±5 ns (M) NTSC; 28 ±5 ns (B, D, G, H, I) PAL 

Figure 6.31. Digital Composite Video Parallel Interface Waveforms. 

Figure 6.32. Serial Interface Block Diagram. 
                  D9 (MSB)  D8   D7   D6   D5   D4   D3   D2   D1   D0
TRS word 0        1         1    1    1    1    1    1    1    1    1
TRS word 1        0         0    0    0    0    0    0    0    0    0
TRS word 2        0         0    0    0    0    0    0    0    0    0
TRS word 3        0         0    0    0    0    0    0    0    0    0
line number ID    not D8    EP   line number ID (D7-D0)

Note: 
EP = even parity for D0-D7. 

Table 6.13. TRS-ID Format. 



D2   D1   D0   (M) NTSC               (B, D, G, H, I) PAL
0    0    0    line 1-263 field 1     line 1-313 field 1
0    0    1    line 264-525 field 2   line 314-625 field 2
0    1    0    line 1-263 field 3     line 1-313 field 3
0    1    1    line 264-525 field 4   line 314-625 field 4
1    0    0    not used               line 1-313 field 5
1    0    1    not used               line 314-625 field 6
1    1    0    not used               line 1-313 field 7
1    1    1    not used               line 314-625 field 8

D7-D3         (M) NTSC                      (B, D, G, H, I) PAL
1 ≤ x ≤ 30    line number 1-30 [264-293]    line number 1-30 [314-343]
x = 31        line number ≥ 31 [294]        line number ≥ 31 [344]
x = 0         not used                      not used

Table 6.14. Line Number ID Word at Horizontal Count 794 (NTSC) or 971 (PAL). 
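
As an illustration of how the line number ID word might be assembled from Tables 6.13 and 6.14, here is a short sketch. The function name and argument conventions are hypothetical, and it assumes D9 carries the complement of D8, as is usual for ancillary-style words; confirm against SMPTE 259M before relying on it.

```python
def line_number_id(field: int, line_in_field: int) -> int:
    """Build the 10-bit line number ID word.

    field:          1-4 for (M) NTSC, 1-8 for PAL
    line_in_field:  1-based line count within the field (for example,
                    NTSC line 264 is line 1 of an even field)
    """
    d2_d0 = (field - 1) & 0x7             # field code per Table 6.14
    d7_d3 = min(line_in_field, 31)        # 31 means "line 31 or later"
    low8 = (d7_d3 << 3) | d2_d0
    ep = bin(low8).count("1") & 1         # even parity over D0-D7
    d9 = ep ^ 1                           # assumed: complement of D8
    return (d9 << 9) | (ep << 8) | low8
```

For example, `line_number_id(1, 1)` encodes line 1 of Field 1, while any line past the thirtieth in a field saturates at x = 31.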
Pro-Video Transport 
Interfaces 

Serial Data Transport Interface (SDTI) 

SMPTE 305M and ITU-R BT.1381 define a 
Serial Data Transport Interface (SDTI) that 
enables transferring data between equipment. 
The physical layer uses the 270 or 360 Mbps 
BT.656, BT.1302, and SMPTE 259M digital 
component video serial interface. Figure 6.33 
illustrates the signal format. 

A 53-word header is inserted immediately 
after the EAV sequence, specifying the source, 
destination, and data format. Table 6.15 illus- 
trates the header contents. 

The payload data is defined within BT.1381 
and by other application-specific standards 
such as SMPTE 326M. It may consist of 
MPEG-2 program or transport streams, DV 
streams, etc., and uses either 8-bit words plus 
even parity and the complement of D8, or 9-bit 
words plus the complement of D8. 

Line Number 

The line number specifies a value of 1-525 
(480i systems) or 1-625 (576i systems). L0 is 
the least significant bit. 

Line Number CRC 

The line number CRC applies to the data 
ID through the line number, for the entire 10 
bits. C0 is the least significant bit. It is an 18-bit 
value, with an initial value set to all ones: 

CRC = x^18 + x^5 + x^4 + 1 
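
A bitwise version of an 18-bit CRC of this kind is sketched below. It assumes the generator is x^18 + x^5 + x^4 + 1 and that each 10-bit word is fed LSB first; the bit ordering and the function name `crc18` are assumptions for illustration, so the result should be checked against the standard before being compared with real equipment.

```python
def crc18(words, init=0x3FFFF):
    """Bitwise CRC over 10-bit words, generator x^18 + x^5 + x^4 + 1."""
    POLY = 0x31                      # x^5 + x^4 + 1; the x^18 term is implicit
    crc = init                       # initial value: all ones
    for w in words:
        for i in range(10):          # LSB first (an assumption)
            bit = (w >> i) & 1
            msb = (crc >> 17) & 1
            crc = (crc << 1) & 0x3FFFF
            if msb ^ bit:
                crc ^= POLY
    return crc                       # C17 is bit 17, C0 is bit 0
```

The 18-bit result would then be split across two words, C8-C0 in the first and C17-C9 in the second, as the header table shows.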



Code and AAI 

The 4-bit code value (CD3-CD0) specifies 
the length of the payload (the user data contained 
between the SAV and EAV sequences): 

0000 4:2:2 YCbCr video data 

0001 1440 word payload (uses 270 Mbps interface) 

0010 1920 word payload (uses 360 Mbps interface) 

1000 143 Mbps digital composite video 

The 4-bit authorized address identifier (AAI) 
value, AAI3-AAI0, specifies the format of the 
destination and source addresses: 

0000 unspecified format 

0001 IPv6 address 
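
A receiver might decode the code and AAI word with a pair of lookup tables. This sketch assumes the bit layout shown in the SDTI header table (AAI3-AAI0 in D7-D4, CD3-CD0 in D3-D0); the names are illustrative, and values not listed above are simply reported as "not listed".

```python
PAYLOAD_CODES = {
    0b0000: "4:2:2 YCbCr video data",
    0b0001: "1440 word payload (270 Mbps interface)",
    0b0010: "1920 word payload (360 Mbps interface)",
    0b1000: "143 Mbps digital composite video",
}

AAI_FORMATS = {
    0b0000: "unspecified format",
    0b0001: "IPv6 address",
}


def decode_code_aai(word: int):
    """Split a 10-bit code and AAI word into its two 4-bit fields."""
    code = word & 0xF            # CD3-CD0 in D3-D0
    aai = (word >> 4) & 0xF      # AAI3-AAI0 in D7-D4
    return (PAYLOAD_CODES.get(code, "not listed"),
            AAI_FORMATS.get(aai, "not listed"))
```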



Destination and Source Addresses 

These specify the address of the source 
and destination devices. A universal address is 
indicated when all address bits are zero and 
AAI3-AAI0 = 0000. 

Block Type 

The block type value specifies the segmen- 
tation of the payload. BL7-BL6 indicate the 
payload block structure: 

00 fixed block size without ECC 

01 fixed block size with ECC 

10 unassigned 

11 variable block size 

BL5-BL0 indicate the segmentation for fixed 
block sizes. Variable block sizes are indicated 
by BL7-BL0 having a value of 11000001. The 
ECC format is application-dependent. 
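
The block type byte splits into a two-bit structure field and a six-bit segmentation field. A minimal decoding sketch (the function name is hypothetical, and the segmentation value is returned raw because its meaning is application-dependent):

```python
BLOCK_STRUCTURES = {
    0b00: "fixed block size without ECC",
    0b01: "fixed block size with ECC",
    0b10: "unassigned",
    0b11: "variable block size",
}


def decode_block_type(bl: int):
    """Return (structure description, BL5-BL0 segmentation value)."""
    structure = BLOCK_STRUCTURES[(bl >> 6) & 0x3]   # BL7-BL6
    segmentation = bl & 0x3F                        # BL5-BL0
    return structure, segmentation
```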
Figure 6.33. SDTI Signal Format. 



Payload CRC Flag 

The CRCF bit indicates whether or not the 
payload CRC is present at the end of the pay- 
load: 

0 no CRC 

1 CRC present 



Header CRC 

The header CRC applies to the code and 
AAI word through the last reserved data word, 
for the entire 10 bits. C0 is the least significant 
bit. It is an 18-bit value, with an initial value set 
to all ones: 

CRC = x^18 + x^5 + x^4 + 1 



High Data-Rate Serial Data Transport 
Interface (HD-SDTI) 

SMPTE 348M and ITU-R BT.1577 define a 
High Data-Rate Serial Data Transport Inter- 
face (HD-SDTI) that enables transferring data 
between equipment. The physical layer uses 
the 1.485 (or 1.485/1.001) Gbps SMPTE 292M 
digital component video serial interface. 

Figure 6.34 illustrates the signal format. 
Two data channels are multiplexed onto the 
single HD-SDTI stream such that one 74.25 (or 
74.25/1.001) MHz data stream occupies the Y 
data space and the other 74.25 (or 74.25/1.001) 
MHz data stream occupies the CbCr data 
space. 
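
The link rate follows directly from the two multiplexed channels, each carrying 10-bit words. A quick arithmetic check (constant names are illustrative):

```python
CHANNEL_RATE_HZ = 74.25e6    # word rate of each data channel
BITS_PER_WORD = 10
CHANNELS = 2                 # Y data space and CbCr data space

serial_rate_bps = CHANNEL_RATE_HZ * BITS_PER_WORD * CHANNELS
serial_rate_bps_ntsc = serial_rate_bps / 1.001   # 1.485/1.001 Gbps variant
```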

A 49-word header is inserted immediately 
after the line number CRC data, specifying the 
source, destination, and data format. Table 
6.16 illustrates the header contents. 

The payload data is defined by other appli- 
cation-specific standards. It may consist of 
MPEG-2 program or transport streams, DV 
streams, etc., and uses either 8-bit words plus 
even parity and D8, or 9-bit words plus D8. 

Code and AAI 

The 4-bit code value (CD3-CD0) specifies 
the length of the payload (the user data con- 
tained between the SAV and EAV sequences) : 

0000 4:2:2 YCbCr video data 

0001 1440 word payload 

0010 1920 word payload 

0011 1280 word payload 

1000 143 Mbps digital composite video 

1001 2304 word payload (extended mode) 

1010 2400 word payload (extended mode) 

1011 1440 word payload (extended mode) 

1100 1728 word payload (extended mode) 

1101 2880 word payload (extended mode) 

1110 3456 word payload (extended mode) 

1111 3600 word payload (extended mode) 
                             10-bit Data
                             D9 (MSB)  D8     D7      D6      D5      D4      D3      D2      D1      D0
ancillary data flag (ADF)    0         0      0       0       0       0       0       0       0       0
                             1         1      1       1       1       1       1       1       1       1
                             1         1      1       1       1       1       1       1       1       1
data ID (DID)                not D8    EP     0       1       0       0       0       0       0       0
SDID                         not D8    EP     0       0       0       0       0       0       0       1
data count (DC)              not D8    EP     0       0       1       0       1       1       1       0
line number                  not D8    EP     L7      L6      L5      L4      L3      L2      L1      L0
                             not D8    EP     0       0       0       0       0       0       L9      L8
line number CRC              not D8    C8     C7      C6      C5      C4      C3      C2      C1      C0
                             not D8    C17    C16     C15     C14     C13     C12     C11     C10     C9
code and AAI                 not D8    EP     AAI3    AAI2    AAI1    AAI0    CD3     CD2     CD1     CD0
destination address          not D8    EP     DA7     DA6     DA5     DA4     DA3     DA2     DA1     DA0
                             not D8    EP     DA15    DA14    DA13    DA12    DA11    DA10    DA9     DA8
                             :
                             not D8    EP     DA127   DA126   DA125   DA124   DA123   DA122   DA121   DA120
source address               not D8    EP     SA7     SA6     SA5     SA4     SA3     SA2     SA1     SA0
                             not D8    EP     SA15    SA14    SA13    SA12    SA11    SA10    SA9     SA8
                             :
                             not D8    EP     SA127   SA126   SA125   SA124   SA123   SA122   SA121   SA120

Note: 
EP = even parity for D0-D7. 

Table 6.15a. SDTI Header Structure. 
                     10-bit Data
                     D9 (MSB)  D8     D7     D6     D5     D4     D3     D2     D1     D0
block type           not D8    EP     BL7    BL6    BL5    BL4    BL3    BL2    BL1    BL0
payload CRC flag     not D8    EP     0      0      0      0      0      0      0      CRCF
reserved             not D8    EP     0      0      0      0      0      0      0      0
reserved             not D8    EP     0      0      0      0      0      0      0      0
reserved             not D8    EP     0      0      0      0      0      0      0      0
reserved             not D8    EP     0      0      0      0      0      0      0      0
reserved             not D8    EP     0      0      0      0      0      0      0      0
header CRC           not D8    C8     C7     C6     C5     C4     C3     C2     C1     C0
                     not D8    C17    C16    C15    C14    C13    C12    C11    C10    C9
checksum             not D8    Sum of D0-D8 of data ID through last header CRC word. 
                               Preset to all zeros; carry is ignored. 

Note: 
EP = even parity for D0-D7. 

Table 6.15b. SDTI Header Structure (continued). 
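
The checksum rule in the table (a 9-bit sum of D0-D8, preset to zero, carry ignored) can be sketched as follows. The function name is illustrative, and placing the complement of D8 in D9 is an assumption based on the usual ancillary data convention.

```python
def header_checksum(words):
    """Checksum over the words from the data ID through the last
    header CRC word: 9-bit sum of D0-D8, carry discarded."""
    total = 0                                # preset to all zeros
    for w in words:
        total = (total + (w & 0x1FF)) & 0x1FF
    d9 = ((total >> 8) & 1) ^ 1              # assumed: D9 = complement of D8
    return (d9 << 9) | total
```

A receiver can recompute the sum over the same words and compare the low nine bits with the received checksum word.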




Figure 6.34. HD-SDTI Signal Format. LN = line number (two 10-bit words), CRC = line number 
CRC (two 10-bit words). 
                             10-bit Data
                             D9 (MSB)  D8     D7      D6      D5      D4      D3      D2      D1      D0
ancillary data flag (ADF)    0         0      0       0       0       0       0       0       0       0
                             1         1      1       1       1       1       1       1       1       1
                             1         1      1       1       1       1       1       1       1       1
data ID (DID)                not D8    EP     0       1       0       0       0       0       0       0
SDID                         not D8    EP     0       0       0       0       0       0       1       0
data count (DC)              not D8    EP     0       0       1       0       1       0       1       0
code and AAI                 not D8    EP     AAI3    AAI2    AAI1    AAI0    CD3     CD2     CD1     CD0
destination address          not D8    EP     DA7     DA6     DA5     DA4     DA3     DA2     DA1     DA0
                             not D8    EP     DA15    DA14    DA13    DA12    DA11    DA10    DA9     DA8
                             :
                             not D8    EP     DA127   DA126   DA125   DA124   DA123   DA122   DA121   DA120
source address               not D8    EP     SA7     SA6     SA5     SA4     SA3     SA2     SA1     SA0
                             not D8    EP     SA15    SA14    SA13    SA12    SA11    SA10    SA9     SA8
                             :
                             not D8    EP     SA127   SA126   SA125   SA124   SA123   SA122   SA121   SA120
block type                   not D8    EP     BL7     BL6     BL5     BL4     BL3     BL2     BL1     BL0
payload CRC flag             not D8    EP     0       0       0       0       0       0       0       0
reserved                     not D8    EP     0       0       0       0       0       0       0       0

Note: 
EP = even parity for D0-D7. 

Table 6.16a. HD-SDTI Header Structure. 
                     10-bit Data
                     D9 (MSB)  D8     D7     D6     D5     D4     D3     D2     D1     D0
reserved             not D8    EP     0      0      0      0      0      0      0      0
reserved             not D8    EP     0      0      0      0      0      0      0      0
reserved             not D8    EP     0      0      0      0      0      0      0      0
reserved             not D8    EP     0      0      0      0      0      0      0      0
header CRC           not D8    C8     C7     C6     C5     C4     C3     C2     C1     C0
                     not D8    C17    C16    C15    C14    C13    C12    C11    C10    C9
checksum             not D8    Sum of D0-D8 of data ID through last header CRC word. 
                               Preset to all zeros; carry is ignored. 

Note: 
EP = even parity for D0-D7. 

Table 6.16b. HD-SDTI Header Structure (continued). 



The extended mode advances the timing of 
the SAV sequence, shortening the blanking 
interval, so that the payload data rate remains 
a constant 129.6 (or 129.6/1.001) MBps. 

The 4-bit authorized address identifier 
(AAI) format is the same as for SDTI. 

Destination and Source Addresses 

The source and destination address for- 
mats are the same as for SDTI. 



Block Type 

The block type format is the same as for 
SDTI. 

Header CRC 

The header CRC applies to the DID 
through the last reserved data word, for the 
entire 10 bits. C0 is the least significant bit. It is 
an 18-bit value, with an initial value set to all 
ones: 

CRC = x^18 + x^5 + x^4 + 1 
IC Component Interfaces 

Many solutions for transferring digital 
video between chips are derived from the pro- 
video interconnect standards. Chips for the 
pro-video market typically support 10 or 12 bits 
of data per video component, while chips for 
the consumer market typically use 8 bits of 
data per video component. BT.601 and BT.656 
are the most popular interfaces. 

YCbCr Values: 8-bit Data 

Y has a nominal range of 0x10-0xEB. Values 
less than 0x10 or greater than 0xEB may 
be present due to processing. Cb and Cr have a 
nominal range of 0x10-0xF0. Values less than 
0x10 or greater than 0xF0 may be present due 
to processing. YCbCr data may not use the values 
of 0x00 and 0xFF since those values may be 
used for timing information. 

During blanking, Y data should have a 
value of 0x10 and CbCr data should have a 
value of 0x80, unless other information is 
present. 

YCbCr Values: 10-bit Data 

For higher accuracy, pro-video solutions 
typically use 10-bit YCbCr data. Y has a nominal 
range of 0x040-0x3AC. Values less than 
0x040 or greater than 0x3AC may be present 
due to processing. Cb and Cr have a nominal 
range of 0x040-0x3C0. Values less than 0x040 or 
greater than 0x3C0 may be present due to processing. 
The values 0x000-0x003 and 0x3FC-0x3FF 
may not be used to avoid timing contention 
with 8-bit systems. 

During blanking, Y data should have a 
value of 0x040 and CbCr data should have a 
value of 0x200, unless other information is 
present. 
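
The 10-bit nominal levels are the 8-bit levels with two fractional bits appended, which amounts to a left shift by two. A short sketch (the helper names are illustrative, and truncation on the way back to 8 bits is just the simplest choice; rounding is another option):

```python
def to_10bit(value8: int) -> int:
    """Append two least significant zero bits to an 8-bit sample."""
    return value8 << 2


def to_8bit(value10: int) -> int:
    """Drop the two least significant bits of a 10-bit sample."""
    return value10 >> 2


levels_8bit = {"black": 0x10, "white": 0xEB,
               "chroma_max": 0xF0, "chroma_zero": 0x80}
levels_10bit = {name: to_10bit(v) for name, v in levels_8bit.items()}
```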



RGB Values: 8-bit Data 

Consumer solutions typically use 8-bit 
R'G'B' data, with a range of 0x10-0xEB (note 
that PCs typically use a range of 0x00-0xFF). 
Values less than 0x10 or greater than 0xEB 
may be present due to processing. 

During blanking, R'G'B' data should have 
a value of 0x10, unless other information is 
present. 

RGB Values: 10-bit Data 

For higher accuracy, pro-video solutions 
typically use 10-bit R'G'B' data, with a nominal 
range of 0x040-0x3AC. Values less than 0x040 
or greater than 0x3AC may be present due to 
processing. The values 0x000-0x003 and 
0x3FC-0x3FF may not be used to avoid timing 
contention with 8-bit systems. 

During blanking, R'G'B' data should have 
a value of 0x040, unless other data is present. 

BT.601 Video Interface 

The BT.601 video interface has been used 
for years, with the control signal names and 
timing reflecting the video standard. Sup- 
ported active resolutions and sample clock 
rates are dependent on the video standard and 
aspect ratio. 

Devices usually support multiple data for- 
mats to simplify using them in a wide variety of 
applications. 

Video Data Formats 

The 24-bit 4:4:4 YCbCr data format is 
shown in Figure 6.35. Y, Cb, and Cr are each 8 
bits, and all are sampled at the same rate, 
resulting in 24 bits of data per sample clock. 
Pro-video solutions typically use a 30-bit interface, 
with the Y, Cb, and Cr streams each being 
10 bits. Y0, Cb0, and Cr0 are the least significant 
bits. 

The 16-bit 4:2:2 YCbCr data format is 
shown in Figure 6.36. Cb and Cr are sampled 
at one-half the Y sample rate, then multiplexed 
together. The CbCr stream of active data 
words always begins with a Cb sample. Pro-video 
solutions typically use a 20-bit interface, 
with the Y and CbCr streams each being 10 
bits. 

The 8-bit 4:2:2 YCbCr data format is shown 
in Figure 6.37. The Y and CbCr streams from 
the 16-bit 4:2:2 YCbCr format are simply multiplexed 
at 2x the sample clock rate. The YCbCr 
stream of active data words always begins with 
a Cb sample. Pro-video solutions typically use a 
10-bit interface. 

Figure 6.37. 8-Bit 4:2:2 YCbCr Data Format. 
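
The multiplexing from the 16-bit format to the 8-bit format can be illustrated with a few lines of code. This is a sketch only; `mux_422` and `demux_422` are hypothetical names, and the streams are plain Python lists rather than clocked hardware signals.

```python
def mux_422(y_stream, cbcr_stream):
    """Interleave the Y and CbCr streams of the 16-bit format into
    one stream at 2x the sample clock: Cb0, Y0, Cr0, Y1, Cb2, Y2, ..."""
    out = []
    for chroma, luma in zip(cbcr_stream, y_stream):
        out.append(chroma)       # chroma word first, so the line starts with Cb
        out.append(luma)
    return out


def demux_422(stream):
    """Split the multiplexed stream back into (Y, CbCr) streams."""
    return stream[1::2], stream[0::2]
```

Because the multiplexed stream always begins with a Cb sample, a receiver can recover the Cb/Cr ordering from the start of active video alone.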

Tables 6.17 and 6.18 illustrate the 15-bit 
RGB, 16-bit RGB, and 24-bit RGB formats. For 
the 15-bit RGB format, the unused bit is sometimes 
used for keying (alpha) information. R0, 
G0, and B0 are the least significant bits. 
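
As a rough illustration of the reduced-depth RGB formats, the sketch below packs 8-bit components into 15-bit and 16-bit words. It assumes the common 5:5:5 layout (with the spare top bit available for alpha) and the common 5:6:5 layout; the exact pin assignments in Tables 6.17 and 6.18 may differ, and the function names are hypothetical.

```python
def pack_rgb555(r: int, g: int, b: int, alpha: int = 0) -> int:
    """Pack 8-bit R, G, B into 15 bits; bit 15 optionally carries alpha."""
    return ((alpha & 1) << 15) | ((r >> 3) << 10) | ((g >> 3) << 5) | (b >> 3)


def pack_rgb565(r: int, g: int, b: int) -> int:
    """Pack 8-bit R, G, B into 16 bits, giving green the extra bit."""
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)
```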



Control Signals 

In addition to the video data, there are four 
control signals: 

HSYNC# (or HREF)      horizontal sync 
VSYNC# (or VREF)      vertical sync 
BLANK# (or ACTIVE)    blanking 
CLK                   1x or 2x sample clock 



For the 8-bit and 10-bit 4:2:2 YCbCr data 
formats, CLK is a 2x sample clock. For the 
other data formats, CLK is a 1x sample clock. 
For sources, the control signals and video data 
are output following the rising edge of CLK. 
For receivers, the control signals and video 
data are sampled on the rising edge of CLK. 

While BLANK# is negated, active R'G'B' 
or YCbCr video data is present. 

HSYNC# is asserted during the horizontal 
sync time each scan line, with the leading edge 
indicating the start of a new line. The amount 
of time that HSYNC# is asserted is usually the 
same as that specified by the video standard. 

VSYNC# is asserted during the vertical 
sync time each field or frame, with the leading 
edge indicating the start of a new field or 
frame. The number of scan lines that VSYNC# 
is asserted is usually the same as that specified by 
the video standard. 

For interlaced video, if the leading edges of 
VSYNC# and HSYNC# are coincident, the field 
is Field 1. If the leading edge of VSYNC# 
occurs mid-line, the field is Field 2. For nonin- 
terlaced video, the leading edge of VSYNC# 
indicates the start of a new frame. Figure 6.38 
illustrates the typical HSYNC# and VSYNC# 
relationships. 

Some products use different signal names 
(such as HREF, VREF, and ACTIVE) , different 
polarity, and slightly different signal timing. 
Some products can also transfer data and con- 
trol information using both edges of the clock 
to reduce pin count or to be able to handle 
HDTV data rates without increasing pin count. 

Receiver Considerations 

Assumptions should not be made about 
the number of samples per line or horizontal 
blanking interval. Otherwise, the implementa- 
tion may not work with all sources. 

To ensure compatibility between various 
sources, horizontal counters should be reset 
by the leading edge of HSYNC#, not by the 
trailing edge of BLANK#. 

To handle real-world sources, a receiver 
should use a window for detecting whether 
Field 1 or Field 2 is present. For example, if the 
leading edge of VSYNC# occurs within ±64 1x 
clock cycles of the leading edge of HSYNC#, 
the field is Field 1. Otherwise, the field is Field 2. 
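
The windowed field test might look like the following in a software model of a receiver. The function and parameter names are illustrative; `clocks_per_line` is meant to be measured from the incoming HSYNC# edges rather than assumed, in keeping with the advice above.

```python
def detect_field(vsync_edge: int, hsync_edge: int,
                 clocks_per_line: int, window: int = 64) -> int:
    """Return 1 or 2 depending on where the VSYNC# leading edge falls
    relative to the nearest HSYNC# leading edge.

    All arguments are counts of the 1x sample clock; clocks_per_line
    is the measured distance between consecutive HSYNC# edges.
    """
    delta = (vsync_edge - hsync_edge) % clocks_per_line
    distance = min(delta, clocks_per_line - delta)   # to nearest HSYNC# edge
    return 1 if distance <= window else 2
```

With an 858-clock line, a VSYNC# edge 10 clocks before HSYNC# still reports Field 1, while an edge near mid-line reports Field 2.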

Some video sources indicate sync timing 
by having Y data be an 8-bit value less than 
0x10. However, most video ICs do not do this. 

Table 6.17. Transferring YCbCr and RGB Data over a 12-bit, 16-bit, or 24-bit Interface. *Many 
designs alternately use the red channel to transfer the multiplexed CbCr data. 









Table 6.18. Transferring YCbCr and RGB Data over a 32-bit Interface 

[Figure 6.38: timing diagram showing HSYNC# and VSYNC# at the start of Field 1 (or frame) and at the start of Field 2.] 

Figure 6.38. Typical HSYNC# and VSYNC# Relationships (Not to Scale). Some products use 
different signal names (such as HREF, VREF, and ACTIVE), different polarity, and slightly different 
signal timing. 



In addition, to allow real-world video and test 
signals to be passed through with minimum 
disruption, many ICs now allow the Y data to 
have a value less than 0x10 during active video. 
Thus, receiver designs assuming sync timing 
is present on the Y channel may no longer 
work. 

Video Module Interface (VMI) 

VMI (Video Module Interface) was devel- 
oped in cooperation with several multimedia 
IC manufacturers. The goal was to standardize 
the video interfaces between devices such as 
MPEG decoders, NTSC/PAL decoders, and 
graphics chips. 



Video Data Formats 

The VMI specification defines an 8-bit 
4:2:2 YCbCr data format as shown in Figure 
6.39. Many devices also support the other 
YCbCr and R'G'B' formats discussed in the 
“BT.601 Video Interface” section. 

Control Signals 

In addition to the video data, there are four 
control signals: 

HREF horizontal blanking 

VREF vertical sync 

VACTIVE active video 

PIXCLK 2x sample clock 

For the 8-bit and 10-bit 4:2:2 YCbCr data 
formats, PIXCLK is a 2x sample clock. For the 
other data formats, PIXCLK is a 1x sample 
clock. For sources, the control signals and 
video data are output following the rising edge 
of PIXCLK. For receivers, the control signals 
and video data are sampled on the rising edge 
of PIXCLK. 

While VACTIVE is asserted, active R'G'B' 
or YCbCr video data is present. Although tran- 
sitions in VACTIVE are allowed, it is intended 
to allow a hardware mechanism for cropping 
video data. For systems that do not support a 
VACTIVE signal, HREF can generally be con- 
nected to VACTIVE with minimal loss of func- 
tion. 

To support video sources that do not generate 
a line-locked clock, a DVALID# (data 
valid) signal may also be used. While 
DVALID# is asserted, valid data is present. 

HREF is asserted during the active video 
time of each scan line, including during the 
vertical blanking interval. 



VREF is asserted for 6 scan line times, 
starting one-half scan line after the start of ver- 
tical sync. 

For interlaced video, the trailing edge of 
VREF is used to sample HREF. If HREF is 
asserted, the field is Field 1. If HREF is 
negated, the field is Field 2. For noninterlaced 
video, the leading edge of VREF indicates the 
start of a new frame. Figure 6.40 illustrates the 
typical HREF and VREF relationships. 

Receiver Considerations 

Assumptions should not be made about 
the number of samples per line or horizontal 
blanking interval. Otherwise, the implementa- 
tion may not work with all sources. 

Video data has input setup and hold times, 
relative to the rising edge of PIXCLK, of 5 and 
0 ns, respectively. 

VACTIVE has input setup and hold times, 
relative to the rising edge of PIXCLK, of 5 and 
0 ns, respectively. 

HREF and VREF both have input setup 
and hold times, relative to the rising edge of 
PIXCLK, of 5 and 5 ns, respectively. 

Figure 6.39. VIMI 8-bit 4:2:2 YCbCr Data for One Scan Line. 

[Figure 6.40 panels: Start of Field 1; Start of Field 2.] 

Figure 6.40. VMI Typical HREF and VREF Relationships (Not to Scale). 



BT.656 Interface 

The BT.656 interface for ICs is based on 
the pro-video BT.656-type parallel interfaces, 
discussed earlier in this chapter (Figures 6.1 
and 6.9xxx). Using EAV and SAV sequences to 
indicate video timing reduces the number of 
pins required. The timing of the H, V, and F 
signals for common video formats is illustrated 
in Chapter 4. 

Standard IC signal levels and timing are 
used, and any resolution can be supported. 

Video Data Formats 

8-bit or 10-bit 4:2:2 YCbCr data is used, as 
shown in Figures 6.1 and 6.6. Although 
sources should generate the four protection 
bits in the EAV and SAV sequences, receivers 
may choose to ignore them due to the reliability 
of point-to-point transfers between chips. 

Control Signals 

CLK is a 2x sample clock. For sources, the 
video data is output following the rising edge 
of CLK. For receivers, the video data is sam- 
pled on the rising edge of CLK. 

To be able to handle HDTV data rates, 
some designs use a 16-bit or 20-bit YCbCr 
interface (essentially two BT.656 streams, one 
for Y data and one for CbCr data) or transfer 
data using both edges of the clock. 

Zoomed Video Port (ZV Port) 

An early standard for notebook PCs, the 
ZV Port was a point-to-point uni-directional bus 
between the PC Card host adaptor and the 
graphics controller. It enabled video data to be 
transferred real time directly from the PC Card 
into the graphics frame buffer. 

The PC Card host adaptor had a special 
multimedia mode configuration. If a non-ZV PC 
Card was plugged into the slot, the host adaptor 
was not switched into the multimedia 
mode, and the PC Card behaved as expected. 
Once a ZV card had been plugged in and the 
host adaptor had been switched to the multimedia 
mode, the pin assignments changed. As 
shown in Table 6.19, the PC Card signals A6-A25, 
SPKR#, INPACK#, and IOIS16# were 
replaced by ZV Port video signals (Y0-Y7, 
CbCr0-CbCr7, HREF, VREF, and PCLK) and 
4-channel audio signals (MCLK, SCLK, LRCK, 
and SDATA). 

Video Data Formats 

16-bit 4:2:2 YCbCr data was used, as shown 
in Figure 6.36. 

Control Signals 

In addition to the video data, there were 
three control signals: 

HREF horizontal reference 

VREF vertical sync 

PCLK 1x sample clock 

HREF, VREF, and PCLK had the same tim- 
ing as the VMI interface discussed earlier in 
this chapter. 



PC Card    ZV Port    PC Card    ZV Port    PC Card    ZV Port 
Signal     Signal     Signal     Signal     Signal     Signal 

A25        CbCr7      A17        Y1         A9         Y0 
A24        CbCr5      A16        CbCr2      A8         Y2 
A23        CbCr3      A15        CbCr4      A7         SCLK 
A22        CbCr1      A14        Y6         A6         MCLK 
A21        CbCr0      A13        Y4         SPKR#      SDATA 
A20        Y7         A12        CbCr6      IOIS16#    PCLK 
A19        Y5         A11        VREF       INPACK#    LRCK 
A18        Y3         A10        HREF 

Table 6.19. PC Card vs. ZV Port Signal Assignments. 

Video Interface Port (VIP) 

The VESA VIP specification is an enhance- 
ment to the BT.656 interface for ICs, previ- 
ously discussed. The primary application is to 
interface up to four devices to a graphics con- 
troller chip, although the concept can easily be 
applied to other applications. 

There are three sections to the interface: 

Host Interface: 

VIPCLK host clock 

HAD0-HAD7 host address/data bus 

HCTL host control 

Video Interface: 

PIXCLK video sample clock 

VID0-VID7 lower video data bus 

VIDA, VIDB 10-bit data extension 

XPIXCLK video sample clock 

XVID0-XVID7 upper video data bus 

XVIDA, XVIDB 10-bit data extension 

System Interface: 

VRST# reset 

VIRQ# interrupt request 

The host interface signals are provided by 
the graphics controller. Essentially, a 2-, 4-, or 
8-bit version of the PCI interface is used. VIP- 
CLK has a frequency range of 25-33 MHz. PIX- 
CLK and XPIXCLK have a maximum 
frequency of 75 and 80 MHz, respectively. 

Video Interface 

As with the BT.656 interface, special four-word 
sequences are inserted into the 8-bit or 
10-bit 4:2:2 YCbCr video stream to indicate the 
start of active video (SAV) and end of active 
video (EAV). These sequences also indicate 
when horizontal and vertical blanking are 
present and which field is being transmitted. 

VIP modifies the BT.656 EAV and SAV 
sequences as shown in Table 6.20. BT.656 uses 
four protection bits (P0-P3) in the status word 
since it was designed for long cable connections 
between equipment. With chip-to-chip 
interconnect, this protection isn’t required, so 
the bits are used for other purposes. The timing 
of the H, V, and F signals for common video 
formats is illustrated in Chapter 4. The status 
word for VIP is defined as: 

T = “0” for task B T = “1” for task A 

F = “0” for Field 1 F = “1” for Field 2 

V = “1” during vertical blanking 

H = “0” at SAV H = “1” at EAV 

The task bit, T, is programmable. If BT.656 
compatibility is required, it should always be a 
“1.” Otherwise, it may be used to indicate 
which one of two data streams is present: 
stream A = “1” and stream B = “0.” Alternately, 
T may be a “0” when raw 2x oversampled VBI 
data is present, and a “1” otherwise. 

The noninterlaced bit, N, indicates 
whether the source is progressive (“1”) or 
interlaced (“0”). 

The repeat bit, R, is a “1” if the current 
field is a repeat field. This occurs only during 
3:2 pull-down. The repeat bit (R), in conjunction 
with the noninterlaced bit (N), enables the 
graphics controller to handle Bob and Weave, 
as well as 3:2 pull-down (further discussed in 
Chapter 7), in hardware. 

The extra flag bit, E, is a “1” if another byte 
follows the EAV. Table 6.21 illustrates the extra 
flag byte. This bit is valid only during EAV 
sequences. If the E bit in the extra byte is “1,” 
another extra byte immediately follows. This 
allows chaining any number of extra bytes 
together as needed. 
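
As a rough illustration of the status word and extra-byte chaining described above, the following Python sketch scans a byte stream for the 0xFF, 0x00, 0x00 preamble and decodes the word that follows. The bit positions (T in D7 down to E in D0) follow Table 6.20; the names `VipStatus`, `decode_vip_status`, and `find_timing_codes` are invented for this example. A real receiver would also need to account for ancillary data, where 0x00 and 0xFF may legitimately appear, so treat the scan as a simplification.

```python
from dataclasses import dataclass


@dataclass
class VipStatus:
    task_a: bool       # T: True for task A, False for task B
    field: int         # F: 1 or 2
    vblank: bool       # V: True during vertical blanking
    is_eav: bool       # H: True at EAV, False at SAV
    progressive: bool  # N: True for a noninterlaced source
    repeat: bool       # R: True for a repeat field (3:2 pull-down)
    extra: bool        # E: only meaningful in EAV sequences


def decode_vip_status(word):
    """Decode the fourth word of a VIP EAV or SAV sequence."""
    is_eav = bool(word & 0x10)
    return VipStatus(
        task_a=bool(word & 0x80),
        field=2 if word & 0x40 else 1,
        vblank=bool(word & 0x20),
        is_eav=is_eav,
        progressive=bool(word & 0x08),
        repeat=bool(word & 0x04),
        extra=is_eav and bool(word & 0x01),
    )


def find_timing_codes(stream):
    """Yield (offset, VipStatus, extra_bytes) for each preamble found."""
    i, n = 0, len(stream)
    while i + 3 < n:
        if stream[i] == 0xFF and stream[i + 1] == 0x00 and stream[i + 2] == 0x00:
            status = decode_vip_status(stream[i + 3])
            j = i + 4
            extras = []
            more = status.extra
            while more and j < n:
                extras.append(stream[j])
                more = bool(stream[j] & 0x01)  # E bit of the extra byte chains another
                j += 1
            yield i, status, extras
            i = j
        else:
            i += 1
```

For example, `list(find_timing_codes(bytes([0xFF, 0, 0, 0x91, 0x03, 0x02])))` reports one EAV for task A, Field 1, followed by two chained extra bytes.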

8-bit Data     D7 (MSB)   D6   D5   D4   D3   D2   D1   D0 

preamble       1          1    1    1    1    1    1    1 
               0          0    0    0    0    0    0    0 
               0          0    0    0    0    0    0    0 
status word    T          F    V    H    N    R    0    E 

Table 6.20. VIP EAV and SAV Sequence. 





8-bit Data     D7 (MSB)   D6   D5   D4   D3   D2   D1   D0 

extra byte     D0         user defined                  E 

Table 6.21. VIP EAV Extra Byte. 



Unlike pro-video interfaces, code 0x00 may 
be used during active video data to indicate an 
invalid video sample. This is used to accommo- 
date scaled video and square pixel timing. 

Video Data Formats 

In the 8-bit mode (Figure 6.41), the video 
interface is similar to BT.656, except for the dif- 
ferences mentioned. XVID0-XVID7 are not 
used. 

In the 16-bit mode (Figure 6.42), SAV 
sequences, EAV sequences, Y video data, ancillary 
packet headers, and even-numbered ancillary 
data values are transferred across the 
lower 8 bits (VID0-VID7). CbCr video data and 
odd-numbered ancillary data values are transferred 
across the upper 8 bits (XVID0-XVID7). 



Note that “skip data” (value 0x00) during 
active video must also appear in 16-bit format 
to preserve the 16-bit data alignment. 

10-bit video data is supported by the VIDA, 
VIDB, XVIDA, and XVIDB signals. VIDA and 
XVIDA are the least significant bits. 

Ancillary Data 

Ancillary data packets are used to transmit 
information (such as digital audio, closed captioning, 
and teletext data) during the blanking 
intervals, as shown in Table 6.22. Unlike pro-video 
interfaces, the 0x00 and 0xFF values 
may be used by the ancillary data. Note that 
the ancillary data formats were defined prior to 
many of the pro-video ancillary data formats, 
and therefore may not match. 

[Figure 6.41 labels: start of digital line; EAV code (4 words); blanking (268 words); SAV code (4 words); start of digital active line; active 4:2:2 video (1440 words, with co-sited samples marked); 1716 words per line in total; H signal.] 

Figure 6.41. VIP 8-Bit Interface Data for One Scan Line. 480i; 720 active 
samples per line; 27 MHz clock. 

Figure 6.42. VIP 16-Bit Interface Data for One Scan Line. 1080i; 1920 active 
samples per line; 74.176 or 74.25 MHz clock. 

8-bit Data                   D7 (MSB)   D6   D5    D4    D3    D2     D1     D0 

ancillary data flag (ADF)    0          0    0     0     0     0      0      0 
                             1          1    1     1     1     1      1      1 
                             1          1    1     1     1     1      1      1 
data ID (DID)                D6         EP   0     1     0     DID2   DID1   DID0 
SDID                         D6         EP   user defined value 
data count (DC)              D6         EP   DC5   DC4   DC3   DC2    DC1    DC0 
internal data ID 0           user defined value 
internal data ID 1           user defined value 
data word 0                  D7         D6   D5    D4    D3    D2     D1     D0 
: 
data word N                  D7         D6   D5    D4    D3    D2     D1     D0 
checksum                     D6         EP   CS5   CS4   CS3   CS2    CS1    CS0 
optional fill data           D6         EP   0     0     0     0      0      0 

Note: EP = even parity for D0-D5. 

Table 6.22. VIP Ancillary Data Packet General Format. 



DID2 of the DID field indicates whether 
Field 1 or Field 2 ancillary data is present: 

0 Field 1 

1 Field 2 

DID1-DID0 of the DID field indicate the 
type of ancillary data present: 

00 start of field 

01 sliced VBI data, lines 1-23 

10 end of field VBI data, line 23 

11 sliced VBI data, line 24 to end of field 



The data count value (DC) specifies the 
number of D-words (4-byte blocks) of ancillary 
data present. Thus, the number of data words 
in the ancillary packet after the DID must be a 
multiple of four. 1-3 optional fill bytes may be 
added after the checksum data to meet this 
requirement. 

When DID1-DID0 are “00” or “10,” no 
ancillary data or checksum is present. The 
data count (DC) value is “00000,” and is the 
last field present in the packet. 
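
The DID and data count fields can be unpacked with a few bit operations. The sketch below is illustrative only: the function names are invented, and it assumes that "even parity" means the EP bit in D6 makes the total number of ones in D0-D5 plus EP even. It does not check D7 or the checksum, since the text above does not spell out how they are computed.

```python
ANC_TYPES = {
    0b00: "start of field",
    0b01: "sliced VBI data, lines 1-23",
    0b10: "end of field VBI data, line 23",
    0b11: "sliced VBI data, line 24 to end of field",
}


def _payload_with_parity_check(byte):
    """Return D5-D0 after checking EP (D6), assumed to be even parity."""
    low6 = byte & 0x3F
    expected_ep = bin(low6).count("1") & 1
    if ((byte >> 6) & 1) != expected_ep:
        raise ValueError("parity error in ancillary header byte")
    return low6


def decode_did(did_byte):
    """Return (field, data type) from a VIP ancillary DID byte."""
    low6 = _payload_with_parity_check(did_byte)
    if (low6 >> 3) != 0b010:  # D5-D3 are fixed at 0, 1, 0 in Table 6.22
        raise ValueError("unexpected DID pattern")
    field = 2 if low6 & 0x04 else 1  # DID2
    return field, ANC_TYPES[low6 & 0x03]  # DID1-DID0


def payload_bytes(dc_byte):
    """Return the ancillary payload length in bytes (DC counts 4-byte D-words)."""
    return 4 * _payload_with_parity_check(dc_byte)


print(decode_did(0x55))
print(payload_bytes(0x42))
```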

Consumer Component 
Interfaces 

Many solutions for transferring digital 
video between equipment have been developed 
over the years. HDMI, originally derived 
from DVI, is the most popular digital video 
interface for consumer equipment. 

Digital Visual Interface (DVI) 

In 1998, the Digital Display Working 
Group (DDWG) was formed to address the 
need for a standardized digital video interface 
between a PC and VGA monitor, as illustrated 
in Figure 6.43. The DVI 1.0 specification was 
released in April 1999. 

Designed to transfer uncompressed real- 
time digital video, DVI supports PC graphics 
resolutions beyond 1600 x 1200 and HDTV res- 
olutions, including 720p, 1080i, and 1080p. 

In 2003, the consumer electronics industry 
started adding DVI outputs to DVD players 
and cable/satellite set-top boxes. DVI inputs 
also started appearing on digital televisions 
and LCD/ plasma monitors. 



Technology 

DVI is based on the Digital Flat Panel 
(DFP) Interface, enhancing it by supporting 
more formats and timings. It also includes sup- 
port for the High-bandwidth Digital Content 
Protection (HDCP) specification to deter unau- 
thorized copying of content. 

DVI also supports VESA’s Extended Display 
Identification Data (EDID) standard, Display 
Data Channel (DDC) standard (used to 
read the EDID), and Monitor Timing Specification 
(DMT). 

DDC and EDID enable automatic display 
detection and configuration. Extended Display 
Identification Data (EDID) was created to 
enable plug and play capabilities of displays. 
Data is stored in the display, describing the 
supported video formats. This information is 
supplied to the source device, over DVI, at the 
request of the source device. The source 
device then chooses its output format, taking 
into account the format of the original video 
stream and the formats supported by the dis- 
play. The source device is responsible for the 
format conversions necessary to supply video 
in an understandable form to the display. 



Figure 6.43. Using DVI to Connect a VGA Monitor to a PC. 

In addition, the CEA-861 standard specifies 
mandatory and optionally supported resolu- 
tions and timings, and how to include data 
such as aspect ratio and format information. 

TMDS Links 

DVI uses transition-minimized differential 
signaling (TMDS). Eight bits of video data are 
converted to a 10-bit transition-minimized, DC-balanced 
value, which is then serialized. The 
receiver deserializes the data, and converts it 
back to 8 bits. Thus, to transfer digital R'G'B' 
data requires three TMDS signals that comprise 
one TMDS link. 
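
To make the 8-bit to 10-bit conversion concrete, here is a minimal Python sketch of a TMDS video-data encoder and decoder. It follows the two-stage scheme as commonly described for DVI 1.0 (an XOR or XNOR chain to minimize transitions, then conditional inversion to balance DC), but it is written from memory rather than from the specification, so treat the details as assumptions. The function names and the `cnt` running-disparity argument are chosen for this example, and the special codes sent during blanking are not covered.

```python
def tmds_encode(byte, cnt):
    """Encode one 8-bit value. `cnt` is the running disparity.

    Returns (10-bit word, updated cnt). Bit 8 records XOR (1) or
    XNOR (0) coding; bit 9 records whether bits 0-7 were inverted.
    """
    d = [(byte >> i) & 1 for i in range(8)]  # d[0] is the LSB
    ones = sum(d)
    use_xnor = ones > 4 or (ones == 4 and d[0] == 0)

    # Stage 1: transition minimization.
    q_m = [d[0]]
    for i in range(1, 8):
        bit = q_m[i - 1] ^ d[i]
        q_m.append(bit ^ 1 if use_xnor else bit)
    q_m.append(0 if use_xnor else 1)

    # Stage 2: DC balancing.
    n1 = sum(q_m[:8])
    n0 = 8 - n1
    if cnt == 0 or n1 == n0:
        invert = q_m[8] == 0
        cnt += (n0 - n1) if invert else (n1 - n0)
    elif (cnt > 0 and n1 > n0) or (cnt < 0 and n0 > n1):
        invert = True
        cnt += 2 * q_m[8] + (n0 - n1)
    else:
        invert = False
        cnt += -2 * (1 - q_m[8]) + (n1 - n0)

    data = [b ^ 1 for b in q_m[:8]] if invert else q_m[:8]
    bits = data + [q_m[8], 1 if invert else 0]
    return sum(b << i for i, b in enumerate(bits)), cnt


def tmds_decode(word):
    """Recover the 8-bit value from a 10-bit TMDS data word."""
    q = [(word >> i) & 1 for i in range(10)]
    data = [b ^ 1 for b in q[:8]] if q[9] else q[:8]
    d = [data[0]]
    for i in range(1, 8):
        bit = data[i] ^ data[i - 1]
        d.append(bit if q[8] else bit ^ 1)
    return sum(b << i for i, b in enumerate(d))
```

The decoder needs no disparity state: bits 8 and 9 of each word carry everything required to undo the encoding, which is why the receiver can convert each 10-bit value back to 8 bits independently.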

“TFT data mapping” is supported as the 
minimum requirement: 1 pixel per clock, 8 bits 
per channel, MSB justified. 

Either one or two TMDS links may be 
used, as shown in Figures 6.44 and 6.45, 
depending on the formats and timing required. 
A system supporting two TMDS links must be 
able to switch dynamically between formats 
requiring a single link and formats requiring a 
dual link. A single DVI connector can handle 
two TMDS links. 



A single TMDS link supports resolutions 
and timings using a video sample rate of 25- 
165 MHz. Resolutions and timings using a 
video sample rate of 165-330 MHz are imple- 
mented using two TMDS links, with each 
TMDS link operating at one-half the frequency. 
Thus, the two TMDS links share the same 
clock and the bandwidth is shared evenly 
between the two links. 
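
The link selection rule above amounts to a simple threshold test. The following sketch is only an illustration; the function name `dvi_link_config` is an assumption, and the treatment of exactly 165 MHz as single-link is a choice made here, since the text lists 165 MHz in both ranges.

```python
def dvi_link_config(sample_rate_mhz):
    """Return (number of TMDS links, per-link clock in MHz)."""
    if not 25 <= sample_rate_mhz <= 330:
        raise ValueError("sample rate outside the 25-330 MHz range")
    if sample_rate_mhz <= 165:
        return 1, sample_rate_mhz
    # Dual link: each link runs at one-half the sample rate.
    return 2, sample_rate_mhz / 2


print(dvi_link_config(162.0))
print(dvi_link_config(297.0))
```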

Video Data Formats 

Typically, 24-bit R'G'B' data is transferred 
over a link. For applications requiring more 
than 8 bits per color component, the second 
TMDS link may be used for the additional least 
significant bits. 

For PC applications, R'G'B' data typically 
has a range of 0x00-0xFF. For consumer applications, 
R'G'B' data typically has a range of 
0x10-0xEB (values less than 0x10 or greater 
than 0xEB may be occasionally present due to 
processing). 



[Figure 6.44: block diagram of a TMDS transmitter and TMDS receiver joined by one TMDS link. Signals shown on each side: B0-B7, VSYNC, HSYNC, DE; G0-G7, CTL0, CTL1; R0-R7, CTL2, CTL3; CLK.] 

Figure 6.44. DVI Single TMDS Link. 

[Figure 6.45: block diagram of a TMDS transmitter and TMDS receiver joined by a dual TMDS link. Signals shown for the first link: B0-B7, VSYNC, HSYNC, DE; G0-G7, CTL0, CTL1; R0-R7, CTL2, CTL3; CLK. Signals shown for the second link: B0-B7, CTL4, CTL5; G0-G7, CTL6, CTL7; R0-R7, CTL8, CTL9.] 

Figure 6.45. DVI Dual TMDS Link. 

Control Signals 

In addition to the video data, DVI transmitter 
and receiver chips typically use up to 14 
control signals for interfacing to other chips in 
the system: 

HSYNC        horizontal sync 
VSYNC        vertical sync 
DE           data enable 
CTL0-CTL3    reserved (link 0) 
CTL4-CTL9    reserved (link 1) 
CLK          1x sample clock 


While DE is a “1,” active video is pro- 
cessed. While DE is a “0,” the HSYNC, 
VSYNC, and CTL0-CTL9 signals are pro- 
cessed. HSYNC and VSYNC may be either 
polarity. 

One issue is that some HDTVs use the fall- 
ing edge of the YPbPr tri-level sync, rather 
than the center (rising edge), for horizontal 
timing. When displaying content from DVI, 
this results in the image shifting by 2.3%. Pro- 
viding the ability to adjust the DVI embedded 
sync timing relative to the YPbPr tri-level sync 
timing is a useful capability in this case. Many 
fixed-pixel displays, such as DLP, LCD, and 
plasma, instead use the DE signal as a timing 
reference, avoiding the issue. 



Digital-Only (DVI-D) Connector 

The digital-only connector, which supports 
dual link operation, contains 24 contacts 
arranged as three rows of eight contacts, as 
shown in Figure 6.46. Table 6.23 lists the pin 
assignments. 

Digital-Analog (DVI-I) Connector 

In addition to the 24 contacts used by the 
digital-only connector, the 29-contact digital- 
analog connector adds five additional contacts 
to support analog video as shown in Figure 
6.47. Table 6.24 lists the pin assignments. 



HSYNC    horizontal sync 
VSYNC    vertical sync 
RED      analog red video 
GREEN    analog green video 
BLUE     analog blue video 



The operation of the analog signals is the 
same as for a standard VGA connector. 

DVI-A is available as a plug (male) connec- 
tor only and mates to the analog-only pins of a 
DVI-I connector. DVI-A is only used in adapter 
cables, where there is the need to convert to or 
from a traditional analog VGA signal. 




Figure 6.46. DVI-D Connector. 



Figure 6.47. DVI-I Connector. 

Pin   Signal      Pin   Signal             Pin   Signal 

1     D2-         9     D1-                17    D0- 
2     D2          10    D1                 18    D0 
3     shield      11    shield             19    shield 
4     D4-         12    D3-                20    D5- 
5     D4          13    D3                 21    D5 
6     DDC SCL     14    +5V                22    shield 
7     DDC SDA     15    ground             23    CLK 
8     reserved    16    Hot Plug Detect    24    CLK- 

Table 6.23. DVI-D Connector Signal Assignments. 



Pin   Signal      Pin   Signal             Pin   Signal 

1     D2-         9     D1-                17    D0- 
2     D2          10    D1                 18    D0 
3     shield      11    shield             19    shield 
4     D4-         12    D3-                20    D5- 
5     D4          13    D3                 21    D5 
6     DDC SCL     14    +5V                22    shield 
7     DDC SDA     15    ground             23    CLK 
8     VSYNC       16    Hot Plug Detect    24    CLK- 
C1    RED         C2    GREEN              C3    BLUE 
C4    HSYNC       C5    ground 

Table 6.24. DVI-I Connector Signal Assignments. 









High-Definition Multimedia Interface 
(HDMI) 

Although DVI handles transferring uncom- 
pressed real-time digital RGB video to a dis- 
play, the consumer electronics industry 
preferred a smaller, more flexible solution, 
based on DVI technology. In April 2002, the 
HDMI working group was formed by Hitachi, 
Matsushita Electric (Panasonic), Philips, Sili- 
con Image, Sony, Thomson, and Toshiba. 

HDMI is capable of replacing up to eight 
audio cables (7.1 channels) and up to three 
video cables with a single cable, as illustrated 
in Figure 6.48. In 2004, the consumer electron- 
ics industry started adding HDMI outputs to 
DVD players and cable/ satellite set-top boxes. 
HDMI inputs started appearing on digital tele- 
visions and monitors in 2005. 

Through the use of an adaptor cable, 
HDMI is backwards compatible with equip- 
ment using DVI and the CEA-861 DTV profile. 
However, the advanced features of HDMI, 
such as digital audio, Consumer Electronics 
Control (used to enable passing control com- 
mands between equipment) and color gamut 
metadata, are not available. 



Technology 

HDMI, based on DVI, supports VESA’s 
Extended Display Identification Data (EDID) 
standard and Display Data Channel (DDC) 
standard (used to read the EDID). 

In addition, the CEA-861 standard specifies 
mandatory and optionally supported resolu- 
tions and timings, and how to include data 
such as aspect ratio and format information. 

HDMI also supports the High-bandwidth 
Digital Content Protection (HDCP) specifica- 
tion to deter unauthorized copying of content. 
A common problem is sources not polling the 
TV often enough (twice per second) to see if its 
HDCP circuit is active. This results in snow if 
the TV’s HDMI input is deselected, then later 
selected again. 

The 19-pin Type A connector uses a single 
TMDS link and can therefore carry video signals 
with a 25-340 MHz sample rate. Video 
with a sample rate below 25 MHz (for example, 
13.5 MHz 480i and 576i) is transmitted using a 
pixel-repetition scheme. 
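
Pixel repetition can be pictured as sending each sample more than once so that the link clock reaches the 25 MHz minimum. The sketch below is a simplified illustration with invented function names; it simply picks the smallest integer factor that reaches the minimum, whereas actual HDMI formats define which repetition factors are permitted.

```python
def repetition_factor(sample_rate_mhz, min_link_mhz=25.0):
    """Smallest integer repeat count that lifts the clock to the link minimum."""
    factor = 1
    while sample_rate_mhz * factor < min_link_mhz:
        factor += 1
    return factor


def repeat_pixels(pixels, factor):
    """Send each pixel `factor` times in a row."""
    return [p for p in pixels for _ in range(factor)]


factor = repetition_factor(13.5)          # 480i or 576i at 13.5 MHz
print(factor, 13.5 * factor)              # repeat count and resulting link clock in MHz
print(repeat_pixels(["P0", "P1"], factor))
```

The receiver discards the repeated samples to recover the original 13.5 MHz video.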

To support video signals sampled at 
greater than 340 MHz, the dual-link capability 
of the 29-pin Type B connector is used. 

The 19-pin Type C connector, designed for 
mobile applications, is a smaller version of the 
Type A connector. 

[Figure 6.48 panels: Without HDMI (separate cables, including Y, PB, and PR); With HDMI (a single HDMI cable).] 

Figure 6.48. Using HDMI Eliminates Confusing Cable Connections for Consumers. 

Video Data Formats 

HDMI supports R'G'B', 4:4:4 YCbCr, 4:2:2 
YCbCr, 4:4:4 xvYCC, and 4:2:2 xvYCC. 24, 30, 
36, or 48 bits per pixel can be transferred; color 
depths greater than 24 bits per pixel are called 
“deep color”. 

Video data is either “full range” (0x00-0xFF 
for 8-bit R'G'B' data) or “limited range” 
(0x10-0xEB for 8-bit R'G'B' or Y data, 0x10-0xF0 
for 8-bit CbCr data; values less than or 
greater than these may be present). 

R'G'B' data may be either “full range” or 
“limited range”, except for the 640 x 480 resolution, 
which must always be “full range”. 

YCbCr and xvYCC video data must always 
be “limited range”. 
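
When a source holds full-range data but must send limited-range data (or the reverse), the values have to be rescaled. The sketch below uses the common linear mapping between 0-255 and 16-235 for 8-bit R'G'B' or Y values; the formula and the function names are assumptions for illustration and are not taken from the HDMI specification.

```python
def full_to_limited(value):
    """Map an 8-bit full-range value (0x00-0xFF) to limited range (0x10-0xEB)."""
    return 16 + round(value * 219 / 255)


def limited_to_full(value):
    """Map an 8-bit limited-range value to full range, clamping excursions."""
    scaled = round((value - 16) * 255 / 219)
    return max(0, min(255, scaled))


print(hex(full_to_limited(0x00)), hex(full_to_limited(0xFF)))
print(limited_to_full(0x10), limited_to_full(0xEB))
```

The clamp in `limited_to_full` handles the occasional values below 0x10 or above 0xEB that processing can introduce.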

Audio Data Formats 

Driven by the DVD-Audio standard, audio 
support consists of 1-8 uncompressed audio 
streams with a sample rate of up to 48, 96, or 
192 kHz, depending on the video format. It can 
alternately carry a compressed multi-channel 
audio stream at sample rates up to 192 kHz. 



Digital Flat Panel (DFP) Interface 

The VESA DFP interface was developed 
for transferring uncompressed digital video 
from a computer to a digital flat panel display. 
It supports VESA’s Plug and Display (P&D) 
standard, Extended Display Identification Data 
(EDID) standard, Display Data Channel 
(DDC) standard, and Monitor Timing Specifi- 
cation (DMT). DDC and EDID enable auto- 
matic display detection and configuration. 
Only TFT data mapping is supported: 1 pixel 
per clock, 8 bits per channel, MSB justified. 

Like DVI, DFP uses transition-minimized 
differential signaling (TMDS). Eight bits of video 
data are converted to a 10-bit transition-minimized, 
DC-balanced value, which is then serialized. 
The receiver deserializes the data, and 
converts it back to 8 bits. Thus, to transfer digital 
R'G'B' data requires three TMDS signals 
that comprise one TMDS link. Cable lengths 
may be up to 5 meters. 

Figure 6.49. DFP TMDS Link. 

Figure 6.50. DFP Connector. 



TMDS Links 

A single TMDS link, as shown in Figure 
6.49, supports formats and timings requiring a 
clock rate of 22.5-160 MHz. 



Video Data Formats 

24-bit R'G'B' data is transferred over the 
link, as shown in Figure 6.49. 



Control Signals 

In addition to the video data, DFP transmit- 
ter and receiver chips typically use up to 8 con- 
trol signals for interfacing to other chips in the 
system: 



HSYNC        horizontal sync 
VSYNC        vertical sync 
DE           data enable 
CTL0-CTL3    reserved 
CLK          1x sample clock 

While DE is a “1,” active video is pro- 
cessed. While DE is a “0,” the HSYNC, 
VSYNC, and CTL0-CTL3 signals are pro- 
cessed. HSYNC and VSYNC may be either 
polarity. 



Pin   Signal        Pin   Signal 

1     D1            11    D2 
2     D1-           12    D2- 
3     shield        13    shield 
4     shield        14    shield 
5     CLK           15    D0 
6     CLK-          16    D0- 
7     ground        17    no connect 
8     +5V           18    Hot Plug Detect 
9     no connect    19    DDC SDA 
10    no connect    20    DDC SCL 

Table 6.25. DFP Connector Signal Assignments. 



Connector 

The 20-pin mini-D ribbon (MDR) connector 
contains 20 contacts arranged as two rows 
of ten contacts, as shown in Figure 6.50. Table 
6.25 lists the pin assignments. 

Open LVDS Display Interface (OpenLDI) 

OpenLDI was developed for transferring 
uncompressed digital video from a computer 
to a digital flat panel display. It enhances the 
FPD-Link standard used to drive the displays 
of laptop computers, and adds support for 
VESA’s Plug and Display (P&D) standard, 
Extended Display Identification Data (EDID) 
standard, and Display Data Channel (DDC) 
standard. DDC and EDID enable automatic 
display detection and configuration. 

Unlike DVI and DFP, OpenLDI uses low- 
voltage differential signaling (LVDS). Cable 
lengths may be up to 10 meters. 

LVDS Link 

The LVDS link, as shown in Figure 6.51, 
supports formats and timings requiring a clock 
rate of 32.5-160 MHz. 

Eight serial data lines (A0-A7) and two 
sample clock lines (CLK1 and CLK2) are used. 
The number of serial data lines actually used is 
dependent on the pixel format, with the serial 
data rate being 7x the sample clock rate. The 
CLK2 signal is used in the dual pixel modes for 
backwards compatibility with FPD-Link receiv- 
ers. 

Video Data Formats 

18-bit single pixel, 24-bit single pixel, 18-bit 
dual pixel, or 24-bit dual pixel R'G'B' data is 
transferred over the link. Table 6.26 illustrates 
the mapping between the pixel data bit number 
and the OpenLDI bit number. 

The 18-bit single pixel R'G'B' format uses 
three 6-bit R'G'B' values: R0-R5, G0-G5, and 
B0-B5. OpenLDI serial data lines A0-A2 are 
used to transfer the data. 

The 24-bit single pixel R'G'B' format uses 
three 8-bit R'G'B' values: R0-R7, G0-G7, and 
B0-B7. OpenLDI serial data lines A0-A3 are 
used to transfer the data. 

The 18-bit dual pixel R'G'B' format represents 
two pixels as three upper/lower pairs of 
6-bit R'G'B' values: RU0-RU5, GU0-GU5, 
BU0-BU5, RL0-RL5, GL0-GL5, BL0-BL5. 
Each upper/lower pair represents two pixels. 
OpenLDI serial data lines A0-A2 and A4-A6 
are used to transfer the data. 

The 24-bit dual pixel R'G'B' format represents 
two pixels as three upper/lower pairs of 
8-bit R'G'B' values: RU0-RU7, GU0-GU7, 
BU0-BU7, RL0-RL7, GL0-GL7, BL0-BL7. 
Each upper/lower pair represents two pixels. 
OpenLDI serial data lines A0-A7 are used to 
transfer the data. 
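
The relationship between pixel format, serial data lines, and serial data rate can be tabulated in a short Python sketch. The line assignments and the 7x multiplier come from the text above; the dictionary and function names are invented for this example.

```python
# Serial data lines used by each OpenLDI pixel format.
OPENLDI_LINES = {
    "18-bit single": ["A0", "A1", "A2"],
    "24-bit single": ["A0", "A1", "A2", "A3"],
    "18-bit dual": ["A0", "A1", "A2", "A4", "A5", "A6"],
    "24-bit dual": ["A0", "A1", "A2", "A3", "A4", "A5", "A6", "A7"],
}


def openldi_link(pixel_format, clock_mhz):
    """Return (serial lines used, serial data rate per line in Mbps)."""
    if not 32.5 <= clock_mhz <= 160:
        raise ValueError("clock rate outside the 32.5-160 MHz range")
    # Each serial line carries 7 bits per sample clock period.
    return OPENLDI_LINES[pixel_format], 7 * clock_mhz


print(openldi_link("24-bit single", 65.0))
```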

Control Signals 

In addition to the video data, OpenLDI 
transmitter and receiver chips typically use up 
to seven control signals for interfacing to other 
chips in the system: 



HSYNC    horizontal sync 
VSYNC    vertical sync 
DE       data enable 
CNTLE    reserved 
CNTLF    reserved 
CLK1     1x sample clock 
CLK2     1x sample clock 



During unbalanced operation, the DE, 
HSYNC, VSYNC, CNTLE, and CNTLF levels 
are sent as unencoded bits within the A2 and 
A6 bitstreams. 

During balanced operation (used to minimize 
short- and long-term DC bias), a DC Balance 
bit is sent within each of the A0-A7 
bitstreams to indicate whether the data is 
unmodified or inverted. Since there is no room 
left for the control signals to be sent directly, 
the DE level is sent by slightly modifying the 
timing of the falling edge of the CLK1 and 
CLK2 signals. The HSYNC, VSYNC, CNTLE, 
and CNTLF levels are sent during the blanking 
intervals using 7-bit code words on the A0, A1, 
A5, and A4 signals, respectively. 

Figure 6.51. OpenLDI LVDS Link. 

18 Bits per Pixel    24 Bits per Pixel    OpenLDI 
Bit Number           Bit Number           Bit Number 

5                    7                    5 
4                    6                    4 
3                    5                    3 
2                    4                    2 
1                    3                    1 
0                    2                    0 
                     1                    7 
                     0                    6 

Table 6.26. OpenLDI Bit Number Mappings. 

Connector 

The 36-pin mini-D ribbon (MDR) connec- 
tor is similar to the one shown in Figure 6.50, 
except that there are two rows of eighteen con- 
tacts. Table 6.27 lists the pin assignments. 

Gigabit Video Interface (GVIF) 

The Sony GVIF was developed for transferring 
uncompressed digital video using a single 
differential signal, instead of the multiple signals 
that DVI, DFP, and OpenLDI use. Cable 
lengths may be up to 10 meters. 



GVIF Link 

The GVIF link, as shown in Figure 6.52, 
supports formats and timings requiring a clock 
rate of 20-80 MHz. For applications requiring 
higher clock rates, more than one GVIF link 
may be used. 

The serial data rate is 24x the sample clock
rate for 18-bit R'G'B' data, or 30x the sample
clock rate for 24-bit R'G'B' data.
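
The multipliers above can be turned into a quick range check. The following is only a sketch; the function name is hypothetical, and the result is in Mbps because the sample clock is given in MHz.

```python
def gvif_serial_rate_mbps(sample_clock_mhz, bits_per_pixel):
    """Serial data rate of one GVIF link, in Mbps.

    Uses the multipliers stated in the text: 24x the sample clock
    for 18-bit data and 30x for 24-bit data.
    """
    multiplier = {18: 24, 24: 30}[bits_per_pixel]
    return sample_clock_mhz * multiplier


# Rates at the two ends of the 20-80 MHz clock range of a single link.
for bpp in (18, 24):
    low = gvif_serial_rate_mbps(20, bpp)
    high = gvif_serial_rate_mbps(80, bpp)
    print(f"{bpp}-bit: {low}-{high} Mbps")
```

For a single link this gives 480-1920 Mbps for 18-bit data and 600-2400 Mbps for 24-bit data.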

Video Data Formats 

18-bit or 24-bit R'G'B' data, plus timing, is
transferred over the link. The 18-bit R'G'B'
format uses three 6-bit R'G'B' values: R0-R5,
G0-G5, and B0-B5. The 24-bit R'G'B' format uses
three 8-bit R'G'B' values: R0-R7, G0-G7, and
B0-B7.



Pin   Signal      Pin   Signal      Pin   Signal

1     A0-         13    +5V         25    reserved
2     A1-         14    A4-         26    reserved
3     A2-         15    A5-         27    ground
4     CLK1-       16    A6-         28    DDC SDA
5     A3-         17    A7-         29    ground
6     ground      18    CLK2-       30    USB-
7     reserved    19    A0          31    ground
8     reserved    20    A1          32    A4
9     reserved    21    A2          33    A5
10    DDC SCL     22    CLK1        34    A6
11    +5V         23    A3          35    A7
12    USB         24    reserved    36    CLK2

Table 6.27. OpenLDI Connector Signal Assignments.
18-bit R'G'B' data is converted to 24-bit
data by slicing the R'G'B' data into six 3-bit
values that are in turn transformed into six
4-bit codes. This ensures rich transitions for
receiver PLL locking and good DC balance.

24-bit R'G'B' data is converted to 30-bit
data by slicing the R'G'B' data into six 4-bit
values that are in turn transformed into six
5-bit codes.

Control Signals 

In addition to the video data, there are six 
control signals: 

HSYNC     horizontal sync
VSYNC     vertical sync
DE        data enable
CTL0      reserved
CTL1      reserved
CLK       1x sample clock

If any of the HSYNC, VSYNC, DE, CTL0,
or CTL1 signals change, a special 30-bit format
is used during the next CLK cycle. The first 6
bits are header data indicating the new levels
of HSYNC, VSYNC, DE, CTL0, and CTL1. This
is followed by 24 bits of R'G'B' data
(unencoded except for inverting the odd bits).

Note that during the blanking periods, 
non-video data, such as digital audio, may be 
transferred. The CTL signals may be used to 
indicate when non-video data is present. 
Figure 6.52. GVIF Link.

Consumer Transport
Interfaces

Several transport interfaces, such as USB 
2.0, Ethernet, and IEEE 1394, are available for 
consumer products. Of course, each standard 
has its own advantages and disadvantages. 

USB 2.0 

USB (Universal Serial Bus) is well known in
the PC market for connecting peripherals to a
PC, and there is growing interest in using USB
2.0 to transfer compressed audio/video data
between products.

USB 2.0 is capable of operating up to 480
Mbps and supports an isochronous mode to
guarantee data delivery timing. Thus, it can
easily transfer compressed real-time audio/video
data from a cable/satellite set-top box or
DVD player to a digital television. DTCP
(Digital Transmission Copy Protection) may be
used to encrypt the audio and video content
over USB.

Due to USB’s lower cost and widespread 
usage, many companies are interested in using 
USB 2.0 instead of IEEE 1394 to transfer com- 
pressed audio/video data between products. 
However, some still prefer IEEE 1394 since the 
methods for transferring various types of data 
are much better defined. 

USB On-the-Go 

With portable devices increasing in popu- 
larity, there was a growing desire for them to 
communicate directly with each other without 
requiring a PC or other USB host. 



On-the-Go addresses this desire by allow- 
ing a USB device to communicate directly with 
other On-the-Go products. It also features a 
smaller USB connector and low power features 
to preserve battery life. 

Ethernet 

With the widespread adoption of home
networks, DSL, and FTTH (Fiber-to-the-Home),
Ethernet has become a common interface for
transporting digital audio and video data.
Initially used for file transfers, Ethernet is
now commonly used to stream real-time
compressed video over wired (802.3) or
wireless (802.11) networks.

Ethernet supports up to 1 Gbps. DTCP/IP 
(Digital Transmission Copy Protection for 
Internet Protocol) may be used to encrypt the 
audio and video content over wired or wireless 
networks. 

IEEE 1394 

IEEE 1394 was originally developed by 
Apple Computer as Firewire. Designed to be a 
generic interface between devices, 1394 speci- 
fies the physical characteristics; separate appli- 
cation-specific specifications describe how to 
transfer data over the 1394 network. 

1394 is a transaction-based packet technol- 
ogy, using a bi-directional serial interconnect 
that features hot plug-and-play. This enables 
devices to be connected and disconnected 
without affecting the operation of other 
devices connected to the network. 

Guaranteed delivery of time-sensitive data 
is supported, enabling digital audio and video 
to be transferred in real time. In addition, mul- 
tiple independent streams of digital audio and 
video can be carried. 
Figure 6.53. IEEE 1394 Network Topology Examples. (Panel labels:
"16 hops = 17 nodes max." and "Branching increases node count.")



Specifications 

The original 1394-1995 specification sup- 
ports bit-rates of 98.304, 196.608, and 393.216 
Mbps. 

The 1394A-2000 specification clarifies 
areas that were vague and led to system 
interoperability issues. It also reduces the 
overhead lost to bus control, arbitration, bus 
reset duration, and concatenation of packets. 
1394A-2000 also introduces advanced power- 
saving features. The electrical signaling 
method is also common between 1394-1995 
and 1394A-2000, using data-strobe (DS) encod- 
ing and analog-speed signaling. 

The 1394B-2002 specification adds support
for bit-rates of 786.432, 1572.864, and 3145.728
Mbps. It also includes:

- The 8B/10B encoding technique used by
Gigabit Ethernet

- Continuous dual simplex operation

- Longer distances (up to 100 meters over Cat5)

- A more digital method of speed signaling

- Three types of ports: Legacy (1394A
compatible), Beta, and Bilingual (supports
both Legacy and Beta). Connector keying
ensures that incompatible connections
cannot physically be made.

Endian Issues 

1394 uses a big-endian architecture, defining
the most significant bit as bit 0. However,
many processors are based on the little-endian
architecture, which defines the most significant
bit as bit 31 (assuming a 32-bit word).
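
As a small illustration of the two conventions, the sketch below converts a 1394-style bit number (most significant bit is bit 0) to the numbering described for little-endian processors, and uses Python's standard struct module to move a 32-bit word to and from big-endian byte order. The helper names are hypothetical.

```python
import struct


def msb0_to_lsb0(bit, width=32):
    """Convert a bit number counted from the MSB (bit 0 = MSB)
    to one counted from the LSB (bit width-1 = MSB)."""
    return width - 1 - bit


def quadlet_to_wire(value):
    """Pack a 32-bit word most significant byte first."""
    return struct.pack(">I", value)


def quadlet_from_wire(data):
    """Unpack 4 big-endian bytes into a native integer."""
    return struct.unpack(">I", data)[0]


print(msb0_to_lsb0(0))                      # MSB in the other numbering
print(quadlet_to_wire(0x12345678).hex())    # byte order on the wire
```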

Network Topology 

Like many networks, there is no designated
bus master. The tree-like network structure
has a root node, branching out to logical
nodes in other devices (Figure 6.53). The root
is responsible for certain control functions,
and is chosen during initialization. Once
chosen, it retains that function for as long as it
remains powered on and connected to the
network.

A network can include up to 63 nodes, with 
each node (or device) specified by a 6-bit phys- 
ical identification number. Multiple networks 
may be connected by bridges, up to a system 
maximum of 1,023 networks, with each net- 
work represented by a separate 10-bit bus ID. 
Combined, the 16-bit address allows up to 
64,449 nodes in a system. Since device 
addresses are 64 bits, and 16 of these bits are 
used to specify nodes and networks, 48 bits 
remain for memory addresses, allowing up to 
256TB of memory space per node. 
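
The address arithmetic above can be checked with a short sketch. The field order used here (bus ID in the top 10 bits, physical ID in the next 6 bits, memory offset in the low 48 bits) is an assumption made for illustration, and the function names are hypothetical.

```python
MAX_NODES_PER_BUS = 63
MAX_BUSES = 1023


def pack_address(bus_id, phys_id, offset):
    """Build a 64-bit address from a 10-bit bus ID, a 6-bit
    physical ID, and a 48-bit memory offset (assumed field order)."""
    if not 0 <= bus_id < (1 << 10):
        raise ValueError("bus ID must fit in 10 bits")
    if not 0 <= phys_id < (1 << 6):
        raise ValueError("physical ID must fit in 6 bits")
    if not 0 <= offset < (1 << 48):
        raise ValueError("offset must fit in 48 bits")
    return (bus_id << 54) | (phys_id << 48) | offset


def unpack_address(address):
    """Split a 64-bit address back into (bus_id, phys_id, offset)."""
    return (address >> 54) & 0x3FF, (address >> 48) & 0x3F, address & ((1 << 48) - 1)


print(MAX_NODES_PER_BUS * MAX_BUSES)   # nodes in a fully bridged system
print((1 << 48) // (1 << 40))          # memory space per node, in TB (2^40 bytes)
```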

Node Types 

Nodes on a 1394 bus may vary in complex- 
ity and capability (listed simplest to most com- 
plex) : 

Transaction nodes respond to asynchro- 
nous communication, implement the minimal 
set of control status registers (CSR), and 
implement a minimal configuration ROM. 

Isochronous nodes add a 24.576 MHz clock 
used to increment a cycle timer register that is 
updated by cycle start packets. 

Cycle master nodes add the ability to gener- 
ate the 8 kHz cycle start event, generate cycle 
start packets, and implement a bus timer regis- 
ter. 



Isochronous resource manager (IRM) nodes 
add the ability to detect bad self-ID packets, 
determine the node ID of the chosen IRM, and 
implement the channels available, bandwidth 
available, and bus manager ID registers. At 
least one node must be capable of acting as an 
IRM to support isochronous communication. 

Bus manager (BM) nodes are the most 
complex. This level adds responsibility for 
storing every self-ID packet in a topology map 
and analyzing that map to produce a speed 
map of the entire bus. These two maps are 
used to manage the bus. Finally, the BM must 
be able to activate the cycle master node, write 
configuration packets to allow optimization of 
the bus, and act as the power manager. 

Node Ports 

In the network topology, a one-port device
is known as a “leaf” device since it is at the end
of a network branch. Leaf devices can be
connected to the network, but cannot expand it.

Two-port devices can be used to form 
daisy-chained topologies. They can be con- 
nected to and continue the network, as shown 
in Figure 6.53. Devices with three or more 
ports are able to branch the network to the full 
63-node capability. 



Figure 6.54. IEEE 1394 Data and Strobe Signal Timing. (Waveforms
shown: DATA, STROBE, and STROBE XOR DATA.)

It is important to note that no loops or
parallel connections are allowed within the
network. Also, there are no reserved
connectors; any connector may be used to
add a new device to the network.

Since 1394-1995 mandates a maximum of 
16 cable hops between any two nodes, a maxi- 
mum of 17 peripherals can be included in a net- 
work if only two-port peripherals are used. 
Later specifications implement a ping packet to 
measure the round-trip delay to any node, 
removing the 16 hop limitation. 

For 1394-1995 and 1394A-2000, a 4- or 6-pin
connector is used. The 6-pin connector can
provide power to peripherals. For 1394B-2002,
the 9-pin Beta and Bilingual connector
includes power, two extra pins for signal
integrity, and one pin reserved for future use.



Figure 6.54 illustrates the 1394-1995 and 
1394A-2000 data and strobe timing. The strobe 
signal changes state on every bit period for 
which the data signal does not. Therefore, by 
exclusive-ORing the data and strobe signals, 
the clock is recovered. 
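
A minimal sketch of this data-strobe relationship follows. It assumes both lines start at 0 before the first bit, which is an arbitrary choice for illustration; the function names are hypothetical.

```python
def ds_encode(data_bits):
    """Return the strobe level for each bit period.

    The strobe toggles only in bit periods where the data line
    does not change, so exactly one of the two lines changes
    in every bit period.
    """
    strobe_bits = []
    prev_data = 0   # assumed idle level of the data line
    strobe = 0      # assumed idle level of the strobe line
    for bit in data_bits:
        if bit == prev_data:
            strobe ^= 1
        strobe_bits.append(strobe)
        prev_data = bit
    return strobe_bits


def recover_clock(data_bits, strobe_bits):
    """Exclusive-OR data and strobe to recover the clock."""
    return [d ^ s for d, s in zip(data_bits, strobe_bits)]


data = [1, 0, 0, 1, 1, 1, 0]
strobe = ds_encode(data)
print(strobe)
print(recover_clock(data, strobe))  # changes level every bit period
```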

Physical Layer 

The typical hardware topology of a 1394 
network consists of a physical layer (PHY) and 
link layer (LINK), as shown in Figure 6.55. 
The 1394-1995 standard also defined two soft- 
ware layers, the transaction layer and the bus 
management layer, parts of which may be 
implemented in hardware. 

The PHY transforms the point-to-point
network into a logical physical bus. Each node is
also essentially a data repeater since data is
reclocked at each node. The PHY also defines
the electrical and mechanical connection to the
network. Physical signaling circuits and logic
responsible for power-up initialization,
arbitration, bus-reset sensing, and data
signaling are also included.

Figure 6.55. IEEE 1394 Typical Physical and Link Layer Block Diagrams.

Link Layer 

The Link provides interfacing between the 
physical layer and application layer, formatting 
data into packets for transmission over the net- 
work. It supports both asynchronous and iso- 
chronous data. 

Asynchronous Data 

Delivery of asynchronous packets is
guaranteed: after an asynchronous packet is
received, the receiver transmits an
acknowledgment to the sender, as shown in Figure
6.56. However, there is no guaranteed
bandwidth. This type of communication is useful for
commands, non-real-time data, and error-free
transfers.



The delivery latency of asynchronous 
packets is not guaranteed and depends upon 
the network traffic. However, the sender may 
continually retry until an acknowledgment is 
received. 

Asynchronous packets are targeted to one 
node on the network or can be sent to all 
nodes, but cannot be broadcast to a subset of 
nodes on the bus. 

The maximum asynchronous packet size 
is: 

512 * (n / 100) bytes 

n = network speed in Mbps 

Isochronous Data 

Isochronous communications have a
guaranteed bandwidth, with up to 80% of the
network bandwidth available for isochronous use.
Up to 63 independent isochronous channels
are available, although the 1394 Open Host
Controller Interface (OHCI) currently only
supports 4-32 channels. This type of
communication is useful for real-time audio and video
transfers since the maximum delivery latency
of isochronous packets is calculable and
packets may be targeted to multiple destinations.
However, the sender may not retry sending a
packet.

Figure 6.56. IEEE 1394 Isochronous and Asynchronous Packets.

The maximum isochronous packet size is: 

1024 * (n / 100) bytes 

n = network speed in Mbps 
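
The two packet-size formulas can be evaluated side by side. This sketch assumes that n is the nominal speed (100, 200, 400, and so on) rather than the exact bit-rate such as 98.304 Mbps; the function names are hypothetical.

```python
def max_async_packet_bytes(n):
    """Maximum asynchronous packet size: 512 * (n / 100) bytes."""
    return 512 * n // 100


def max_iso_packet_bytes(n):
    """Maximum isochronous packet size: 1024 * (n / 100) bytes."""
    return 1024 * n // 100


# n is assumed to be the nominal network speed in Mbps.
for n in (100, 200, 400):
    print(n, max_async_packet_bytes(n), max_iso_packet_bytes(n))
```

At every speed the isochronous limit is twice the asynchronous limit.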

Isochronous operation guarantees a time
slice every 125 µs. Since time slots are
guaranteed, and isochronous communication takes
priority over asynchronous, isochronous
bandwidth is assured.

Once an isochronous channel is estab- 
lished, the sending device is guaranteed to 
have the requested amount of bus time for that 
channel every isochronous cycle. Only one 
device may send data on a particular channel, 
but any number of devices may receive data on 
a channel. A device may use multiple isochro- 
nous channels as long as capacity is available. 

Transaction Layer 

The transaction layer supports asynchro- 
nous write, read, and lock commands. A lock 
combines a write with a read by producing a 
round trip routing of data between the sender 
and receiver, including processing by the 
receiver. 

Bus Management Layer 

The bus management layer controls
functions of the network at the physical, link, and
transaction layers.

Digital Transmission Content Protection 
(DTCP) 

To prevent unauthorized copying of
content, the DTCP system was developed.
Although originally designed for 1394, it is
applicable to any digital network that supports
bi-directional communications, such as USB
and Ethernet.

Device authentication, content encryption, 
and renewability (should a device ever be com- 
promised) are supported by DTCP. The Digital 
Transmission Licensing Administrator 
(DTLA) licenses the content protection system 
and distributes cipher keys and device certifi- 
cates. 

DTCP outlines four elements of content 
protection: 

1. Copy control information (CCI) 

2. Authentication and key exchange (AKE) 

3. Content encryption 

4. System renewability 

Copy Control Information (CCI) 

CCI allows content owners to specify how
their content can be used, such as
“copy-never,” “copy-one-generation,”
“no-more-copies,” and “copy-free.” DTCP is capable of
securely communicating copy control
information between devices. Two different CCI
mechanisms are supported: embedded and
encryption mode indicator.

Embedded CCI is carried within the con- 
tent stream. Tampering with the content 
stream results in incorrect decryption, main- 
taining the integrity of the embedded CCI. 

The encryption mode indicator (EMI) pro- 
vides a secure, yet easily accessible, transmis- 
sion of CCI by using the two most significant 
bits of the sync field of the isochronous packet 
header. Devices can immediately determine 
the CCI of the content stream without decod- 
ing the content. If the two EMI bits are tam- 
pered with, the encryption and decryption 
modes do not match, resulting in incorrect 
content decryption. 
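
Reading the EMI amounts to masking off two bits, as the sketch below suggests. The 4-bit width of the sync field and the mapping of the four EMI codes to copy states are assumptions based on common descriptions of 1394 and DTCP, not on anything stated in this section; the names are hypothetical.

```python
# Assumed mapping of EMI codes to copy states (not given in the text).
EMI_STATES = {
    0b00: "copy-free",
    0b01: "no-more-copies",
    0b10: "copy-one-generation",
    0b11: "copy-never",
}


def emi_from_sync(sync_field):
    """Extract the two most significant bits of the sync field.

    The sync field is assumed to be 4 bits wide.
    """
    if not 0 <= sync_field <= 0xF:
        raise ValueError("sync field is assumed to be 4 bits")
    return (sync_field >> 2) & 0b11


sync = 0b1000
print(EMI_STATES[emi_from_sync(sync)])
```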
Authentication and Key Exchange (AKE)

Before sharing content, a device must first 
verify that the other device is authentic. DTCP 
includes a choice of two authentication levels: 
full and restricted. Full authentication can be 
used with all content protected by the system. 
Restricted authentication enables the protec- 
tion of “copy-one-generation” and “no-more- 
copies” content only. 

Full Authentication 

Compliant devices are assigned a unique
public/private key pair and a device certificate
by the DTLA, both stored within the device so
as to prevent their disclosure. In addition,
devices store other necessary constants and
keys.

Full authentication uses the public key- 
based digital signature standard (DSS) and 
Diffie-Hellman (DH) key exchange algo- 
rithms. DSS is a method for digitally signing 
and verifying the signatures of digital docu- 
ments to verify the integrity of the data. DH 
key exchange is used to establish control-chan- 
nel symmetric cipher keys, which allows two 
or more devices to generate a shared key. 

Initially, the receiver sends a request to the 
source to exchange device certificates and ran- 
dom challenges. Then, each device calculates a 
DH key exchange first-phase value. The 
devices then exchange signed messages that 
contain the following elements: 

1. The other device’s random challenge 

2. The DH key-exchange first-phase value 

3. The renewability message version number 
of the newest system renewability message 
(SRM) stored by the device 

The devices check the message signatures
using the other device’s public key to verify
that the message has not been tampered with
and also verify the integrity of the other
device’s certificate. Each device also examines
the certificate revocation list (CRL) embedded
in its system renewability message (SRM) to
verify that the other device’s certificate has not
been revoked due to its security having been
compromised. If no errors have occurred, the
two devices have successfully authenticated
each other and established an authorization
key.
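
The key-exchange part of this procedure can be illustrated with a toy Diffie-Hellman sketch. It shows only how two "first-phase values" lead both sides to the same shared value. The modulus and generator are small illustrative numbers chosen here, not DTCP parameters, and certificates, signatures, and the SRM check are not modeled.

```python
import secrets

# Toy parameters for illustration only; far too small to be secure.
P = 4294967291   # a prime just below 2**32
G = 5


def first_phase(private_value):
    """Compute the value a device sends to the other device."""
    return pow(G, private_value, P)


def shared_value(other_first_phase, private_value):
    """Combine the other device's first-phase value with our own
    private value; both devices arrive at the same result."""
    return pow(other_first_phase, private_value, P)


source_private = secrets.randbelow(P - 2) + 1
receiver_private = secrets.randbelow(P - 2) + 1

source_key = shared_value(first_phase(receiver_private), source_private)
receiver_key = shared_value(first_phase(source_private), receiver_private)
print(source_key == receiver_key)
```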

Restricted Authentication 

Restricted authentication may be used 
between sources and receivers for the 
exchange of “copy-one-generation” and “no- 
more-copies” contents. It relies on the use of a 
shared secret to respond to a random chal- 
lenge. 

The source initiates a request to the
receiver, requests its device ID, and sends a
random challenge. After receiving the
challenge from the source, the receiver
computes a response and sends it to the source.

The source compares this response with
similar information generated by the source
using its service key and the ID of the receiver.
If the response matches the source’s own
calculation, the receiver has been verified and
authenticated. The source and receiver then
each calculate an authorization key.
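
The general shape of a shared-secret challenge-response can be sketched as follows. HMAC-SHA256 is used purely as a stand-in for whatever calculation the real system specifies, and the key, ID, and function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def compute_response(shared_secret, device_id, challenge):
    """Receiver side: derive a response from the shared secret,
    its device ID, and the random challenge (stand-in algorithm)."""
    return hmac.new(shared_secret, device_id + challenge, hashlib.sha256).digest()


def source_verifies(shared_secret, device_id, challenge, response):
    """Source side: repeat the calculation and compare results."""
    expected = compute_response(shared_secret, device_id, challenge)
    return hmac.compare_digest(expected, response)


secret = b"example-shared-secret"     # hypothetical value
device_id = b"\x00\x01\x02\x03\x04"   # hypothetical device ID
challenge = secrets.token_bytes(8)

response = compute_response(secret, device_id, challenge)
print(source_verifies(secret, device_id, challenge, response))
```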

Content Encryption 

To ensure interoperability, all compliant 
devices must support the 56-bit M6 baseline 
cipher. Additional content protection may be 
supported by using additional, optional 
ciphers. 

System Renewability 

Devices that support full authentication
can receive and process SRMs that are created
by the DTLA and distributed with content.
System renewability is used to ensure the
long-term system integrity by revoking the device
IDs of compromised devices.

SRMs can be updated from other compli- 
ant devices that have a newer list, from media 
with prerecorded content, or via compliant 
devices with external communication capabil- 
ity (Internet, phone, cable, network, and so 
on). 

Example Operation 

For this example, the source has been 
instructed to transmit a copy-protected system 
stream of content. 

The source initiates the transmission of 
content marked with the copy protection sta- 
tus: “copy-one-generation,” “copy-never,” “no- 
more-copies,” or “copy-free.” 

Upon receiving the content stream, the
receiver determines the copy protection status.
If marked “copy-never,” the receiver requests
that the source initiate full authentication. If
the content is marked “copy-one-generation” or
“no-more-copies,” the receiver will request full
authentication if supported, or restricted
authentication if it isn’t.

When the source receives the authentica- 
tion request, it proceeds with the requested 
type of authentication. If full authentication is 
requested but the source can only support 
restricted authentication, then restricted 
authentication is used. 
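
The selection rules in this example can be written as a small decision function. Two cases are not spelled out in the text and are assumptions here: "copy-free" content is treated as needing no authentication, and "copy-never" content is refused when either device lacks full authentication. The function name and return values are hypothetical.

```python
def select_authentication(cci, receiver_full, source_full):
    """Return "full", "restricted", "refuse", or None (no
    authentication needed) for a given copy protection status."""
    if cci == "copy-free":
        return None                      # assumption: nothing to protect
    if cci == "copy-never":
        # Full authentication is required for this content.
        if receiver_full and source_full:
            return "full"
        return "refuse"                  # assumption: no fallback allowed
    if cci in ("copy-one-generation", "no-more-copies"):
        requested = "full" if receiver_full else "restricted"
        if requested == "full" and not source_full:
            return "restricted"          # source can only do restricted
        return requested
    raise ValueError(f"unknown copy protection status: {cci}")


print(select_authentication("copy-one-generation", True, False))
```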

Once the devices have completed the 
authentication procedure, a content-channel 
encryption key (content key) is exchanged 
between them. This key is used to encrypt the 
content at the source device and decrypt the 
content at the receiver. 

1394 Open Host Controller Interface 
(OHCI) 

The 1394 Open Host Controller Interface
(OHCI) specification is an implementation of
the 1394 link layer, with additional features to
support the transaction and bus management
layers. It provides a standardized way of
interacting with the 1394 network.

Home AV Interoperability (HAVi) 

Home AV Interoperability (HAVi) is 
another layer of protocols for 1394. HAVi is 
directed at making 1394 devices plug-and-play 
interoperable in a 1394 network whether or not 
a PC host is present. 

Serial Bus Protocol (SBP-2) 

The ANSI Serial Bus Protocol 2 (SBP-2)
defines a standard way of delivering command
and status packets over 1394 for devices such
as DVD players, printers, scanners, hard drives,
and other devices.

IEC 61883 Specifications 

Certain types of isochronous signals, such 
as MPEG-2 and the IEC 61834, SMPTE 314M, 
and ITU-R BT.1618 digital video (DV) stan- 
dards, use specific data transport protocols 
and formats. When this data is sent isochro- 
nously over a 1394 network, special packetiza- 
tion techniques are used. 

The IEC 61883 series of specifications
defines the details for transferring various
application-specific data over 1394:

IEC 61883-1 = General specification

IEC 61883-2 = SD-DVCR data transmission
(25 Mbps continuous bit-rate)

IEC 61883-3 = HD-DVCR data transmission

IEC 61883-4 = MPEG-2 TS data transmission
(bit-rate bursts up to 44 Mbps)

IEC 61883-5 = SDL-DVCR data transmission

IEC 61883-6 = Audio and music data transmission

IEC 61883-7 = Transmission of ITU-R BO.1294
System B

IEC 61883-1

IEC 61883-1 defines the general structure 
for transferring digital audio and video data 
over 1394. It describes the general packet for- 
mat, data flow management, and connection 
management for digital audio and video data, 
and also the general transmission rules for 
control commands. 

A common isochronous packet (CIP) 
header is placed at the beginning of the data 
field of isochronous data packets, as shown in 
Figure 6.57. It specifies the source node, data 
block size, data block count, time stamp, type 
of real-time data contained in the data field, etc. 

A connection management procedure 
(CMP) is also defined for making isochronous 
connections between devices. 

In addition, a functional control protocol 
(FCP) is defined for exchanging control com- 
mands over 1394 using asynchronous data. 



IEC 61883-2 

IEC 61883-2 and SMPTE 396M define the 
CIP header, data packet format, and transmis- 
sion timing for IEC 61834, SMPTE 314M, and 
ITU-R BT.1618 digital video (DV) standards 
over 1394. Active resolutions of 720 x 480 (at 
29.97 frames per second) and 720 x 576 (at 25 
frames per second) are supported. 

DV data packets are 488 bytes long, made 
up of 8 bytes of CIP header and 480 bytes of 
DV data, as shown in Figure 6.57. Figure 6.58 
illustrates the frame data structure. 

Each 720 x 480 4:1:1 YCbCr frame
is compressed to 103,950 bytes, resulting in a
4.9:1 compression ratio. Including overhead
and audio increases the amount of data to
120,000 bytes.



Figure 6.57. 61883-2 Isochronous Packet Formatting. (Panels: normal
isochronous packet and 61883-2 isochronous packet.)

Figure 6.58. IEC 61834, SMPTE 314M, and ITU-R BT.1618 Packet Formatting for 720 x 480
Systems (4:1:1 YCbCr).

The compressed 720 x 480 frame is
divided into 10 DIF (data in frame) sequences.
Each DIF sequence contains 150 DIF blocks of
80 bytes each, used as follows:

135 DIF blocks for video 

9 DIF blocks for audio 

6 DIF blocks used for Header, Subcode, and 
Video Auxiliary (VAUX) information 

Figure 6.59 illustrates the DIF sequence
structure in detail. The audio DIF blocks
contain both audio data and audio auxiliary data
(AAUX). IEC 61834 supports four 32-kHz,
12-bit nonlinear audio signals or two 48-, 44.1-, or
32-kHz, 16-bit audio signals. SMPTE 314M and
ITU-R BT.1618 at 25 Mbps support two 48-kHz
16-bit audio signals, while the 50 Mbps version
supports four. Video auxiliary data (VAUX)
DIF blocks include recording date and time,
lens aperture, shutter speed, color balance,
and other camera setting data. The subcode
DIF blocks store a variety of information, the
most important of which is timecode.

Each video DIF block contains 80 bytes of 
compressed macroblock data: 

3 bytes for DIF block ID information

1 byte for the header that includes the
quantization number (QNO) and block status (STA)

14 bytes each for Y0, Y1, Y2, and Y3

10 bytes each for Cb and Cr

As the 488-byte packets come across the
1394 network, the start of a video frame is
determined. Once the start of a frame is
detected, 250 valid packets of data are
collected to have a complete DV frame; each
packet contains 6 DIF blocks of data. Every
15th packet is a null packet and should be
discarded. Once 250 valid packets of data are in
the buffer, discard the CIP headers. If all went
well, you have a frame buffer with a 120,000
byte compressed DV frame in it.
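
The collection procedure can be sketched as below. Two simplifications are assumed: the packet list starts at the first packet of a frame (start-of-frame detection is not modeled), and a null packet is recognized simply because it is not 488 bytes long. The function name is hypothetical.

```python
CIP_HEADER_BYTES = 8
DV_PAYLOAD_BYTES = 480          # 6 DIF blocks of 80 bytes
PACKETS_PER_FRAME = 250         # 720 x 480 systems


def assemble_dv_frame(packets):
    """Collect 250 valid packets, strip each CIP header, and
    return the 120,000-byte compressed DV frame."""
    frame = bytearray()
    valid = 0
    for packet in packets:
        if len(packet) != CIP_HEADER_BYTES + DV_PAYLOAD_BYTES:
            continue            # assumed to be a null packet; discard
        frame += packet[CIP_HEADER_BYTES:]
        valid += 1
        if valid == PACKETS_PER_FRAME:
            return bytes(frame)
    raise ValueError("stream ended before a complete frame was collected")


# Synthetic stream: every 15th packet carries only a CIP header.
valid_packet = bytes(CIP_HEADER_BYTES) + bytes([1]) * DV_PAYLOAD_BYTES
null_packet = bytes(CIP_HEADER_BYTES)
stream = [null_packet if i % 15 == 14 else valid_packet for i in range(300)]
print(len(assemble_dv_frame(stream)))
```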

720 x 576 frames may use either the 4:2:0 
YCbCr format (IEC 61834) or the 4:1:1 YCbCr 
format (SMPTE 314M and ITU-R BT.1618), 
and require 12 DIF sequences. Each 720 x 576 
frame is compressed to 124,740 bytes. Includ- 
ing overhead and audio increases the amount 
of data to 144,000 bytes, requiring 300 packets 
to transfer. 
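
The frame sizes quoted for both systems follow from the DIF block counts, as this sketch shows. Counting 77 payload bytes per video DIF block (80 bytes minus the 3-byte ID) is an interpretation that happens to reproduce the compressed-video figures in the text; the function name is hypothetical.

```python
DIF_BLOCK_BYTES = 80
BLOCKS_PER_SEQUENCE = 150
VIDEO_BLOCKS_PER_SEQUENCE = 135
DIF_ID_BYTES = 3
PACKET_PAYLOAD_BYTES = 480      # 6 DIF blocks per packet


def dv_frame_stats(dif_sequences):
    """Return (total frame bytes, video bytes, packets per frame)."""
    frame_bytes = dif_sequences * BLOCKS_PER_SEQUENCE * DIF_BLOCK_BYTES
    video_bytes = (dif_sequences * VIDEO_BLOCKS_PER_SEQUENCE
                   * (DIF_BLOCK_BYTES - DIF_ID_BYTES))
    packets = frame_bytes // PACKET_PAYLOAD_BYTES
    return frame_bytes, video_bytes, packets


print(dv_frame_stats(10))   # 720 x 480 systems
print(dv_frame_stats(12))   # 720 x 576 systems
```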

Note that the organization of data trans- 
ferred over 1394 differs from the actual DV 
recording format since error correction is not 
required for digital transmission. In addition, 
although the video blocks are numbered in 
sequence in Figure 6.59, the sequence does 
not correspond to the left-to-right, top-to-bot- 
tom transmission of blocks of video data. Com- 
pressed macroblocks are shuffled to minimize 
the effect of errors and aid in error conceal- 
ment. Audio data also is shuffled. Data is trans- 
mitted in the same shuffled order as recorded. 

To illustrate the video data shuffling, DV
video frames are organized as 50 superblocks,
with each superblock being composed of 27
compressed macroblocks, as shown in Figure
6.60. A group of 5 superblocks (one from each
superblock column) makes up one DIF
sequence. Table 6.28 illustrates the
transmission order of the DIF blocks. Additional
information on the DV data structure is available in
Chapter 11.

Figure 6.59. IEC 61834, SMPTE 314M, and ITU-R BT.1618 DIF Sequence Detail (25 Mbps).

Figure 6.60. Relationship Between Superblocks and Macroblocks (720 x 480, 4:1:1 YCbCr).

Table 6.28. Video DIF Blocks and Compressed Macroblocks for 25 Mbps.

IEC 61883-4

IEC 61883-4 defines the CIP header, data
packet format, and transmission timing for
MPEG-2 transport streams over 1394.

It is most efficient to carry an integer
number of 192 bytes (188 bytes of MPEG-2 data
plus 4 bytes of time stamp) per isochronous
packet, as shown in Figure 6.61. However,
MPEG data rates are rarely integer multiples
of the isochronous data rate. Thus, it is more
efficient to divide the MPEG packets into
smaller components of 24 bytes each to
maximize available bandwidth. The transmitter
then uses an integer number of data blocks
(restricted multiples of 0, 1, 2, 4, or 8) placing
them in an isochronous packet and adding the
8-byte CIP header.
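
Dividing one 192-byte unit into 24-byte data blocks yields exactly eight blocks, as sketched below. Placing the 4-byte time stamp ahead of the 188-byte transport packet is an assumption for illustration, and the function name is hypothetical.

```python
import struct

TS_PACKET_BYTES = 188
DATA_BLOCK_BYTES = 24


def to_data_blocks(ts_packet, time_stamp):
    """Prefix a transport packet with a 4-byte time stamp and
    split the resulting 192 bytes into 24-byte data blocks."""
    if len(ts_packet) != TS_PACKET_BYTES:
        raise ValueError("expected a 188-byte transport packet")
    source_packet = struct.pack(">I", time_stamp) + ts_packet
    return [source_packet[i:i + DATA_BLOCK_BYTES]
            for i in range(0, len(source_packet), DATA_BLOCK_BYTES)]


blocks = to_data_blocks(bytes([0x47]) + bytes(187), time_stamp=0)
print(len(blocks), len(blocks[0]))
```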

50 Mbps DV 

Like the 25 Mbps DV format, the 50 Mbps
DV format supports 720 x 480i30 and 720 x
576i25 sources. However, the 50 Mbps DV
format uses 4:2:2 YCbCr rather than 4:1:1 YCbCr.

As previously discussed, the source packet
size for the 25 Mbps DV streams is 480 bytes
(consisting of 6 DIF blocks). The 250 packets
(300 packets for 576i25 systems) of 480-byte
data are transferred over a 25 Mbps channel.

The source packet size for the 50 Mbps DV 
streams is 960 bytes (consisting of 12 DIF 
blocks). The first 125 packets (150 packets for 
576i25 systems) of 960-byte data are sent over 
one 25 Mbps channel and the next 125 packets 
(150 packets for 576i25 systems) of 960-byte 
data are sent over a second 25 Mbps channel. 



100 Mbps DV 

100 Mbps DV streams support 1920 x 
1080i30, 1920 x 1080i25, and 1280 x 720p60 
sources. 1920 x 1080i30 sources are horizon- 
tally scaled to 1280 x 1080i30. 1920 x 1080i25 
sources are horizontally scaled to 1440 x 
1080i25. 1280 x 720p60 sources are horizon- 
tally scaled to 960 x 720p60. The 4:2:2 YCbCr 
format is used. 

The source packet size for the 100 Mbps 
DV streams is 1920 bytes (consisting of 24 DIF 
blocks). The first 63 packets (75 packets for 
1080i25 systems) of 1920-byte data are sent 
over one 25 Mbps channel, the next 62 packets 
(75 packets for 1080i25 systems) of 1920-byte 
data are sent over a second 25 Mbps channel, 
the next 63 packets (75 packets for 1080i25 
systems) of 1920-byte data are sent over a third 
25 Mbps channel, and the last 62 packets (75 
packets for 1080i25 systems) of 1920-byte data 
are sent over a fourth 25 Mbps channel. 
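
The per-channel packet counts for the 50 and 100 Mbps formats can be totalled to confirm that a frame carries two and four times the data of a 25 Mbps frame, respectively. This is only an arithmetic sketch; the function name is hypothetical.

```python
def frame_bytes(packets_per_channel, source_packet_bytes):
    """Total bytes per frame across all 25 Mbps channels."""
    return sum(packets_per_channel) * source_packet_bytes


dv25 = frame_bytes([250], 480)                    # 25 Mbps, 480i30
dv50 = frame_bytes([125, 125], 960)               # 50 Mbps, 480i30
dv100 = frame_bytes([63, 62, 63, 62], 1920)       # 100 Mbps, 30 or 60 Hz
dv100_25 = frame_bytes([75, 75, 75, 75], 1920)    # 100 Mbps, 1080i25

print(dv25, dv50, dv100, dv100_25)
```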
Figure 6.61. 61883-4 Isochronous Packet Formatting.

Digital Camera Specification

The 1394 Trade Association has written a 
specification for 1394-based digital video cam- 
eras. This was done to avoid the silicon and 
software cost of implementing the full IEC 
61883 specification. 

Seven resolutions are defined, with a wide 
range of format support: 



Resolution      Formats

160 x 120       4:4:4 YCbCr
320 x 240       4:2:2 YCbCr
640 x 480       4:1:1, 4:2:2 YCbCr, 24-bit RGB
800 x 600       4:2:2 YCbCr, 24-bit RGB
1024 x 768      4:2:2 YCbCr, 24-bit RGB
1280 x 960      4:2:2 YCbCr, 24-bit RGB
1600 x 1200     4:2:2 YCbCr, 24-bit RGB



Supported frame rates are 1.875, 3.75, 7.5, 15, 
30, and 60 frames per second. 

Isochronous packets are used to transfer 
the uncompressed digital video data over the 
1394 network. 
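
Because the video is uncompressed, the required bandwidth follows directly from resolution, format, and frame rate. The sketch below assumes 8 bits per sample, giving an average of 24, 16, and 12 bits per pixel for 4:4:4, 4:2:2, and 4:1:1 YCbCr; packet overhead is ignored and the names are hypothetical.

```python
# Average bits per pixel, assuming 8-bit samples.
BITS_PER_PIXEL = {"4:4:4": 24, "4:2:2": 16, "4:1:1": 12, "RGB24": 24}


def video_rate_mbps(width, height, pixel_format, frames_per_second):
    """Raw video data rate in Mbps, excluding packet overhead."""
    bits_per_frame = width * height * BITS_PER_PIXEL[pixel_format]
    return bits_per_frame * frames_per_second / 1e6


print(video_rate_mbps(640, 480, "4:2:2", 30))
print(video_rate_mbps(1600, 1200, "RGB24", 15))
```

Comparing such figures with the bus speed gives a first indication of which resolution and frame rate combinations are practical.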
Chapter 7



Digital Video 
Processing 



In addition to encoding and decoding 
MPEG, NTSC/PAL, and many other types of 
video, a typical system usually requires consid- 
erable additional video processing. 

Since many consumer displays, and most
computer displays, are progressive
(noninterlaced), interlaced video must be converted to
progressive (“deinterlaced”). Progressive
video must be converted to interlaced to drive
a conventional analog VCR or interlaced TV,
requiring noninterlaced-to-interlaced
conversion.

Many computer displays support refresh 
rates up to at least 75 frames per second. CRT- 
based televisions have a refresh rate of 50 or 
59.94 (60/1.001) fields per second. Refresh 
rates of up to 120 frames per second are 
becoming common for flat-panel televisions. 
For film-based compressed content, the source
may only be 24 frames per second. Thus, some 
form of frame rate conversion must be done. 



Another not-so-subtle problem is video
scaling. SDTV and HDTV support multiple
resolutions, yet the display may be a single,
fixed resolution.

Alpha mixing and chroma keying are used 
to mix multiple video signals or video with 
computer-generated text and graphics. Alpha 
mixing ensures a smooth crossover between 
sources, allows subpixel positioning of text, 
and limits source transition bandwidths to sim- 
plify eventual encoding to composite video sig- 
nals. 

Since no source is perfect, even digital 
sources, user controls for adjustable bright- 
ness, contrast, saturation, and hue are always 
desirable. 
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Rounding Considerations 

When two 8-bit values are multiplied
together, a 16-bit result is generated. At some
point, a result must be rounded to some lower
precision (for example, 16 bits to 8 bits or 32
bits to 16 bits) in order to realize a
cost-effective hardware implementation. There
are several rounding techniques: truncation,
conventional rounding, error feedback
rounding, and dynamic rounding.

Truncation 

Truncation drops any fractional data dur- 
ing each rounding operation. As a result, after 
only a few operations, a significant error may 
be introduced. This may result in contours 
being visible in areas of solid colors. 

Conventional Rounding 

Conventional rounding uses the fractional
data bits to determine whether to round up or
round down. If the fractional data is 0.5 or
greater, rounding up should be performed:
positive numbers should be made more positive
and negative numbers should be made more
negative. If the fractional data is less than
0.5, rounding down should be performed:
positive numbers should be made less positive
and negative numbers should be made less
negative.
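
The difference between truncation and conventional rounding can be sketched in a few lines of Python. This is only an illustration: it assumes a 16-bit result reduced to 8 bits (8 fractional bits), and the names FRAC_BITS, truncate, and round_conventional are hypothetical.

```python
FRAC_BITS = 8  # assumption: a 16-bit result is reduced to 8 bits


def truncate(value: int) -> int:
    """Discard the fractional bits.

    For two's complement values the arithmetic shift rounds toward
    negative infinity, so the error is always in the same direction.
    """
    return value >> FRAC_BITS


def round_conventional(value: int) -> int:
    """Round half away from zero, as described in the text.

    Half an LSB is added to the magnitude before the fractional bits
    are dropped, so 0.5 or more rounds away from zero and less than
    0.5 rounds toward zero.
    """
    sign = -1 if value < 0 else 1
    half = 1 << (FRAC_BITS - 1)
    return sign * ((abs(value) + half) >> FRAC_BITS)
```

For example, the 8.8 fixed-point value 0x01FF (just under 2.0) truncates to 1 but rounds to 2.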

Error Feedback Rounding 

Error feedback rounding follows the prin- 
ciple of “never throw anything away.” This is 
accomplished by storing the residue of a trun- 
cation and adding it to the next video sample. 
This approach substitutes less visible noise- 
like quantizing errors in place of contouring 
effects caused by simple truncation. An exam- 
ple of an error feedback rounding implementa- 
tion is shown in Figure 7.1. In this example, 16 
bits are reduced to 8 bits using error feedback. 
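
A minimal software sketch of the same idea follows. It assumes non-negative 16-bit samples reduced to 8 bits; the function name and the output clamp are illustrative additions, not taken from Figure 7.1.

```python
def error_feedback_round(samples, frac_bits=8, out_bits=8):
    """Reduce each sample by frac_bits, carrying the residue forward.

    The bits that truncation would discard are stored and added to the
    next sample, so nothing is thrown away over a run of samples.
    """
    mask = (1 << frac_bits) - 1
    max_out = (1 << out_bits) - 1          # assumed saturation limit
    residue = 0
    out = []
    for s in samples:
        total = s + residue                # add the stored residue
        out.append(min(total >> frac_bits, max_out))
        residue = total & mask             # keep the discarded bits
    return out
```

A constant input of 0x1080 (16.5 in 8.8 fixed point) produces the sequence 16, 17, 16, 17, whose average is the original 16.5, rather than the constant 16 that truncation alone would give.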

Dynamic Rounding 

This technique (a licensable Quantel 
patent) dithers the LSB according to the 
weighting of the discarded fractional bits. The 
original data word is divided into two parts, 
one representing the resolution of the final out- 
put word and one dealing with the remaining 
fractional data. The fractional data is compared 
to the output of a random number generator 
equal in resolution to the fractional data. The 
output of the comparator is a 1-bit random pattern
weighted by the value of the fractional
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Figure 7.1. Error Feedback Rounding. 
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data, and serves as a carry-in to the adder. In 
all instances, only one LSB of the output word 
is changed, in a random fashion. An example of 
a dynamic rounding implementation is shown 
in Figure 7.2. 
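
The behavior described above can be modeled in software as follows. This is a hedged sketch of the description only, not of the circuit in Figure 7.2: the comparator polarity, the function name, and the use of Python's random module as the random number generator are assumptions.

```python
import random


def dynamic_round(value, frac_bits=8, rng=random):
    """Dither the LSB in proportion to the discarded fraction.

    The fraction is compared with a random number of the same
    resolution; the comparator output is used as a carry-in, so the
    probability of adding one LSB equals fraction / 2**frac_bits.
    """
    integer = value >> frac_bits
    fraction = value & ((1 << frac_bits) - 1)
    noise = rng.getrandbits(frac_bits)     # same resolution as fraction
    carry = 1 if fraction > noise else 0
    return integer + carry
```

Averaged over many samples, the output approaches the unrounded value: an input of 0x1080 (16.5) yields 16 about half the time and 17 the rest, while an input with a zero fraction is never changed.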

SDTV-HDTV YCbCr 
Transforms 

SDTV and HDTV applications have differ- 
ent colorimetric characteristics, as discussed 
in Chapter 3. Thus, when SDTV (HDTV) data 
is displayed on an HDTV (SDTV) display, the 
YCbCr data should be processed to compen- 
sate for the different colorimetric characteris- 
tics. 

SDTV to HDTV 

A 3 x 3 matrix can be used to convert from
Y601CbCr (SDTV) to Y709CbCr (HDTV):

$$
\begin{bmatrix}
1 & -0.11554975 & -0.20793764 \\
0 & 1.01863972 & 0.11461795 \\
0 & 0.07504945 & 1.02532707
\end{bmatrix}
$$

Note that before processing, the 8-bit DC off- 
set (16 for Y and 128 for CbCr) must be 
removed, then added back in after processing. 
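
As a rough illustration of the offset handling, the sketch below applies the SDTV-to-HDTV matrix to one 8-bit pixel. It assumes the matrix multiplies a (Y, Cb, Cr) column vector; the function names and the final clamp to 0-255 are illustrative choices.

```python
import math

SDTV_TO_HDTV = (
    (1.0, -0.11554975, -0.20793764),
    (0.0, 1.01863972, 0.11461795),
    (0.0, 0.07504945, 1.02532707),
)


def clamp8(x):
    """Round to nearest and saturate to the 8-bit range."""
    return max(0, min(255, int(math.floor(x + 0.5))))


def convert_601_to_709(y, cb, cr):
    # Remove the DC offsets (16 for Y, 128 for Cb and Cr).
    v = (y - 16, cb - 128, cr - 128)
    # Apply the 3 x 3 matrix.
    out = [sum(m * c for m, c in zip(row, v)) for row in SDTV_TO_HDTV]
    # Add the offsets back in after processing.
    return clamp8(out[0] + 16), clamp8(out[1] + 128), clamp8(out[2] + 128)
```

Neutral colors (Cb = Cr = 128) pass through unchanged, since only the color difference terms contribute to the correction.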

HDTV to SDTV 

A 3 x 3 matrix can be used to convert from
Y709CbCr (HDTV) to Y601CbCr (SDTV):

$$
\begin{bmatrix}
1 & 0.09931166 & 0.19169955 \\
0 & 0.98985381 & -0.11065251 \\
0 & -0.07245296 & 0.98339782
\end{bmatrix}
$$

Note that before processing, the 8-bit DC off- 
set (16 for Y and 128 for CbCr) must be 
removed, then added back in after processing. 
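
One quick sanity check on the two sets of coefficients is that each matrix should undo the other. The sketch below multiplies them and reports the largest deviation from the identity matrix; the helper names are hypothetical.

```python
SDTV_TO_HDTV = (
    (1.0, -0.11554975, -0.20793764),
    (0.0, 1.01863972, 0.11461795),
    (0.0, 0.07504945, 1.02532707),
)

HDTV_TO_SDTV = (
    (1.0, 0.09931166, 0.19169955),
    (0.0, 0.98985381, -0.11065251),
    (0.0, -0.07245296, 0.98339782),
)


def matmul3(a, b):
    """Multiply two 3 x 3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def max_identity_error(a, b):
    """Largest absolute difference between a*b and the identity."""
    p = matmul3(a, b)
    return max(abs(p[i][j] - (1.0 if i == j else 0.0))
               for i in range(3) for j in range(3))
```

With the coefficients listed, the product differs from the identity only by the rounding of the eight-decimal-place values.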
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Figure 7.2. Dynamic Rounding. 
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4:4:4 to 4:2:2 YCbCr 
Conversion 

Converting 4:4:4 YCbCr to 4:2:2 YCbCr 
(Figure 7.3) is a common function in digital 
video. 4:2:2 YCbCr is the basis for many digital 
video interfaces, and requires fewer connec- 
tions to implement than 4:4:4. 

Saturation logic should be included in the 
Y, Cb, and Cr data paths to limit the 8-bit range 
to 1-254. The 16 and 128 values shown in Fig- 
ure 7.3 are used to generate the proper levels 
during blanking intervals. 

Y Filtering 

A template for the Y lowpass filter is shown
in Figure 7.4 and Table 7.1.

Because there may be many cascaded
conversions (up to 10 were envisioned), the
filters were designed to adhere to very tight
tolerances to avoid a buildup of visual
artifacts. Departure from flat amplitude and
group delay response due to filtering is
amplified through successive stages. For
example, if filters exhibiting -1 dB at 1 MHz
and -3 dB at 1.3 MHz were employed, the overall
response would be -8 dB (at 1 MHz) and -24 dB
(at 1.3 MHz) after four conversion stages
(assuming two filters per stage).
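
The arithmetic behind these figures is simply that attenuations in dB add when filters are cascaded. A small sketch (the function name is illustrative):

```python
def cascaded_attenuation_db(per_filter_db, stages, filters_per_stage=2):
    """Total response in dB of identical filters in cascade.

    Gains multiply, so their dB values add: the per-filter figure is
    scaled by the total number of filters in the chain.
    """
    return per_filter_db * stages * filters_per_stage
```

Four stages of two filters each give eight filters in series, hence 8 x (-1 dB) = -8 dB and 8 x (-3 dB) = -24 dB.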

Although the sharp cut-off results in ring- 
ing on Y edges, the visual effect should be min- 
imal provided that group-delay performance is 
adequate. When cascading multiple filtering 
operations, the passband flatness and group- 
delay characteristics are very important. The 
passband tolerances, coupled with the sharp 
cut-off, make the template very difficult (some 
say impossible) to match. As a result, there is 
usually a temptation to relax passband accu- 
racy, but the best approach is to reduce the 
rate of cut-off and keep the passband as flat as 
possible. 



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Figure 7.3. 4:4:4 to 4:2:2 YCbCr Conversion. 
Figure 7.4. Y Filter Template (attenuation in dB versus frequency in MHz). Fs = Y 1x sample rate.



                                 Frequency Range     Typical SDTV Tolerances            Typical HDTV Tolerances
Passband Ripple Tolerance        0 to 0.40Fs         ±0.01 dB increasing to ±0.05 dB    ±0.05 dB
Passband Group Delay Tolerance   0 to 0.27Fs         0 increasing to ±1.35 ns           ±0.075T
                                 0.27Fs to 0.40Fs    ±1.35 ns increasing to ±2 ns       ±0.110T

Table 7.1. Y Filter Ripple and Group Delay Tolerances. Fs = Y 1x sample
rate. T = 1 / Fs.
Figure 7.5. Cb and Cr Filter Template for Digital Filter for Sample Rate
Conversion from 4:4:4 to 4:2:2 (attenuation in dB versus frequency in MHz).
Fs = Y 1x sample rate.



                                 Frequency Range     Typical SDTV Tolerances              Typical HDTV Tolerances
Passband Ripple Tolerance        0 to 0.20Fs         0 dB increasing to ±0.05 dB          ±0.05 dB
Passband Group Delay Tolerance   0 to 0.20Fs         delay distortion is zero by design

Table 7.2. CbCr Filter Ripple and Group Delay Tolerances. Fs = Y 1x sample
rate. T = 1 / Fs.
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CbCr Filtering 

Cb and Cr are lowpass filtered and deci- 
mated. In a standard design, the lowpass and 
decimation filters may be combined into a sin- 
gle filter, and a single filter may be used for 
both Cb and Cr by multiplexing. 

As with Y filtering, the Cb and Cr lowpass 
filtering requires a sharp cut-off to prevent 
repeated conversions from producing a cumu- 
lative resolution loss. However, due to the low 
cut-off frequency, the sharp cut-off produces 
ringing that is more noticeable than for Y. 

A template for the Cb and Cr filters is 
shown in Figure 7.5 and Table 7.2. 

Since aliasing is less noticeable in color dif- 
ference signals, the attenuation at half the sam- 
pling frequency is only 6 dB. There is an 
advantage in using a skew-symmetric response 
passing through the -6 dB point at half the 
sampling frequency — this makes alternate 
coefficients in the digital filter zero, almost 
halving the number of taps, and also allows 
using a single digital filter for both the Cb and 
Cr signals. Use of a transversal digital filter has 
the advantage of providing perfect linear phase 
response, eliminating the need for group-delay 
correction. 
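
The zero-coefficient property can be seen in a small half-band filter design. The sketch below is only an illustration under stated assumptions: a Hamming-windowed sinc with 15 taps, edge replication at the ends of the line, and hypothetical function names. It makes no attempt to meet the template in Figure 7.5.

```python
import math


def halfband_taps(half_length=7):
    """Symmetric half-band lowpass, taps indexed -half_length..+half_length.

    The ideal response sin(pi*n/2) / (pi*n) is zero for every even n
    except n = 0, so alternate coefficients vanish. The odd taps are
    scaled so that the DC gain is exactly 1.
    """
    raw = {}
    for n in range(1, half_length + 1, 2):           # odd offsets only
        ideal = math.sin(math.pi * n / 2) / (math.pi * n)
        window = 0.54 + 0.46 * math.cos(math.pi * n / half_length)
        raw[n] = ideal * window
    scale = 0.25 / sum(raw.values())                 # each side sums to 0.25
    taps = []
    for n in range(-half_length, half_length + 1):
        if n == 0:
            taps.append(0.5)                         # center tap
        elif n % 2 == 0:
            taps.append(0.0)                         # zero by symmetry
        else:
            taps.append(raw[abs(n)] * scale)
    return taps


def decimate_by_2(samples, taps):
    """Lowpass filter and keep every second sample (edges replicated)."""
    half = len(taps) // 2
    out = []
    for i in range(0, len(samples), 2):
        acc = 0.0
        for k, t in enumerate(taps):
            j = min(max(i + k - half, 0), len(samples) - 1)
            acc += t * samples[j]
        out.append(acc)
    return out
```

Because the filter is symmetric (transversal with linear phase) and holds no state between calls, the same coefficients can serve both the Cb and Cr streams, for example by calling decimate_by_2 once per component.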

As with the Y filter, the passband flatness 
and group-delay characteristics are very 
important, and the best approach again is to 
reduce the rate of cut-off and keep the pass- 
band as flat as possible. 



Display Enhancement 

Brightness, Contrast, Saturation
(Color), and Hue (Tint)

Working in the YCbCr color space simplifies
the implementation of brightness, contrast,
saturation, and hue controls, as shown in
Figure 7.6. Also illustrated are multiplexers to
allow the output of black screen, blue screen,
and color bars.

The design should ensure that no overflow 
or underflow wraparound errors occur, effec- 
tively saturating results to the 0 and 255 values. 

Y Processing 

16 is subtracted from the Y data to position 
the black level at zero. This removes the DC 
offset so adjusting the contrast does not vary 
the black level. Since the Y input data may 
have values below 16, negative Y values should 
be supported at this point. 

The contrast (or picture or white level) con- 
trol is implemented by multiplying the YCbCr 
data by a constant. If Cb and Cr are not 
adjusted, a color shift will result whenever the 
contrast is changed. A typical 8-bit contrast 
adjustment range is 0-1.992x. 

The brightness (or black level) control is 
implemented by adding or subtracting from 
the Y data. Brightness is done after the con- 
trast to avoid introducing a varying DC offset 
due to adjusting the contrast. A typical 8-bit 
brightness adjustment range is -128 to +127. 

Finally, 16 is added to position the black 
level at 16. 

CbCr Processing 

128 is subtracted from Cb and Cr to posi- 
tion the range about zero. 

The hue (or tint) control is implemented 
by mixing the Cb and Cr data: 

$$Cb' = Cb \cos\theta + Cr \sin\theta$$

$$Cr' = Cr \cos\theta - Cb \sin\theta$$

where $\theta$ is the desired hue angle. A typical 8-bit
hue adjustment range is -30° to +30°.

The saturation (or color) control is
implemented by multiplying both Cb and Cr by a
constant. A typical 8-bit saturation adjustment
range is 0-1.992x. In the example shown in
Figure 7.6, the contrast and saturation values
are multiplied together to reduce the number
of multipliers in the CbCr datapath.

Figure 7.6. Hue, Saturation, Contrast, and Brightness Controls.
(Output multiplexer select: 00 = black screen, 01 = blue screen,
10 = color bars, 11 = normal video.)

Finally, 128 is added to both Cb and Cr. 
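
The Y and CbCr processing steps can be summarized in a short floating-point sketch. It is an illustration only: the function and parameter names are hypothetical, and a real design would use fixed-point arithmetic with the adjustment ranges given above.

```python
import math


def clamp8(x):
    """Round to nearest and saturate to 0-255 (no wraparound)."""
    return max(0, min(255, int(math.floor(x + 0.5))))


def adjust(y, cb, cr, contrast=1.0, brightness=0, saturation=1.0,
           hue_deg=0.0):
    # Y path: remove the black-level offset, apply contrast, then
    # brightness, then restore the offset.
    y_out = (y - 16) * contrast + brightness + 16

    # CbCr path: center the data about zero.
    cb0, cr0 = cb - 128, cr - 128

    # Hue: rotate the (Cb, Cr) vector by the hue angle.
    t = math.radians(hue_deg)
    cb1 = cb0 * math.cos(t) + cr0 * math.sin(t)
    cr1 = cr0 * math.cos(t) - cb0 * math.sin(t)

    # Contrast and saturation are combined into one gain so only a
    # single multiply per component is needed in the CbCr path.
    gain = contrast * saturation
    return clamp8(y_out), clamp8(cb1 * gain + 128), clamp8(cr1 * gain + 128)
```

With all controls at their defaults the pixel passes through unchanged, and changing the contrast leaves a black pixel (Y = 16) at black.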

Many displays also use separate hue and 
saturation controls for each of the red, green, 
blue, cyan, yellow, and magenta colors. This 
enables tuning the image at production time to 
better match the display’s characteristics. 

Color Transient Improvement 

YCbCr transitions should be aligned. How- 
ever, the Cb and Cr transitions are usually 
slower and time-offset due to the narrower 
bandwidth of color difference information. 

By monitoring coincident Y transitions, 
faster horizontal and vertical transitions may 
be synthesized for Cb and Cr. Small pre- and 
after-shoots may also be added to the Cb and 
Cr signals. 

The new Cb and Cr edges are then aligned 
with the Y edge, as shown in Figure 7.7. 

Displays commonly use this technique to 
provide a sharper-looking picture. 






Luma Transient Improvement 

In this case, the Y horizontal and vertical 
transitions are shortened, and small pre- and 
after-shoots may also be added, to artificially 
sharpen the image. 

Displays commonly use this technique to 
provide a sharper-looking picture. 

Sharpness 

The apparent sharpness of a picture may 
be increased by increasing the amplitude of 
high-frequency luminance information. 

As shown in Figure 7.8, a simple bandpass 
filter with selectable gain (also called a peaking 
filter) may be used. The frequency where max- 
imum gain occurs is usually selectable to be 
either at the color subcarrier frequency or at 
about 2.6 MHz. A coring circuit is typically 
used after the filter to reduce low-level noise. 

Figure 7.9 illustrates a more complex
sharpness control circuit. The high-frequency
luminance is increased using a variable
bandpass filter, with adjustable gain. The coring
function (typically ±1 LSB) removes low-level
noise. The modified luminance is then added
to the original luminance signal.
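
A minimal software model of this kind of sharpness control is sketched below. The 5-tap filter (-1, 0, 2, 0, -1)/4, which has zero gain at DC and at half the sample rate and peaks at one quarter of the sample rate, merely stands in for the variable bandpass filter; the function name and default values are assumptions.

```python
import math


def sharpen(y, gain=0.5, coring=1):
    """Peak one line of luma samples with a cored bandpass signal."""
    n = len(y)
    out = []
    for i in range(n):
        left = y[max(i - 2, 0)]              # replicate samples at the edges
        right = y[min(i + 2, n - 1)]
        detail = (2 * y[i] - left - right) / 4.0   # bandpass output
        if abs(detail) <= coring:
            detail = 0.0                     # coring removes low-level noise
        value = y[i] + gain * detail         # add back to the original luma
        out.append(max(0, min(255, int(math.floor(value + 0.5)))))
    return out
```

Applied to a step from 50 to 200, the sketch produces a small undershoot before the edge and an overshoot after it, while single-LSB noise on a flat area is left untouched by the coring.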

In addition to selectable gain, selectable 
attenuation of high frequencies should also be 
supported. Many televisions boost high-fre- 
quency gain to improve the apparent sharp- 
ness of the picture. Although the sharpness 
control on the television may be turned down, 
this affects the picture quality of analog broad- 
casts. 



Figure 7.7. Color Transient Improvement. 
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Figure 7.9. More Complex Sharpness Control. (A) Typical implementation. (B) Coring function. 
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Blue Stretch 

Blue stretch increases the blue value of
white and near-white colors in order to make
whites appear brighter. When applying blue
stretch, only colors within a specified
color range should be processed.

Colors that have a Y value of ~80% or more of
the maximum, have a low saturation value, and
fall within a white detection area in the CbCr
plane have their blue components increased
by ~4% (the blue gain factor) and their red
components decreased by the same amount. For
more complex designs, the white detection
area and blue gain factor can be dependent on
the color’s Y value and saturation level.

A transition boundary can be used around 
the white detection area for gradually decreas- 
ing the blue gain factor as colors move away 
from the white detection area boundary. This 
can prevent hard transitions between areas 
that are blue stretched and areas that are not. 
If a color falls inside the transition boundary 
area, it is blue stretched using a fraction of the 
blue gain factor, with the fraction decreasing as 
the distance from the edge of the detection 
area boundary increases. 
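
One possible way to model the detection area and transition boundary is sketched below. Everything numeric here is an assumption made for illustration: a circular white detection area of radius 12 centered on Cb = Cr = 128, a transition band 8 codes wide, a Y threshold of 191 (about 80% of the 16-235 range), and the gain applied to R'G'B' components.

```python
import math


def blue_gain_fraction(y, cb, cr, y_min=191, radius=12.0, transition=8.0):
    """Fraction (0-1) of the blue gain factor to apply to this color."""
    if y < y_min:
        return 0.0                           # not bright enough
    dist = math.hypot(cb - 128, cr - 128)    # saturation in the CbCr plane
    if dist <= radius:
        return 1.0                           # inside the detection area
    if dist >= radius + transition:
        return 0.0                           # outside the transition band
    # Inside the transition band: fade out linearly with distance.
    return 1.0 - (dist - radius) / transition


def blue_stretch_rgb(r, g, b, fraction, blue_gain=0.04):
    """Raise blue and lower red by the (scaled) blue gain factor."""
    k = blue_gain * fraction
    return (max(0.0, min(255.0, r * (1.0 - k))),
            g,
            max(0.0, min(255.0, b * (1.0 + k))))
```

The linear fade is what prevents a visible hard edge between stretched and unstretched areas.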

Green Enhancement 

Green enhancement creates a richer, more
saturated green color when the level of green
is low. Displays commonly use this technique
to provide greener looking grass, plants, etc.
When applying green enhancement, only colors
within a specified color range should
be processed.

Colors that have a low green saturation value
and fall within a green detection area in the
CbCr plane have their saturation increased.
Rather than centering the green detection area
about the green axis (241° in Figure 9.28),
some designs use ~213° for the green detection
axis so the same design can also easily be
used to implement skin tone correction.

Simple implementations have the maximum
saturation gain (~1.2x) occurring on the
green detection axis, with the saturation gain
decreasing to 1x as the distance from the
green detection axis increases. For more
complex designs, the green detection area and
maximum saturation gain can be dependent on
the color’s Y value and saturation level.
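
A simple gain profile of this kind might look like the sketch below. The linear fall-off, the 30° half-width of the detection area, and the convention of measuring the angle from the +Cb axis toward +Cr are all assumptions; the check for a low saturation value is omitted for brevity.

```python
import math


def green_saturation_gain(cb, cr, axis_deg=241.0, half_width_deg=30.0,
                          max_gain=1.2):
    """Saturation gain for a color, based on its hue angle."""
    angle = math.degrees(math.atan2(cr - 128, cb - 128)) % 360.0
    # Smallest angular distance to the detection axis (0-180 degrees).
    diff = abs((angle - axis_deg + 180.0) % 360.0 - 180.0)
    if diff >= half_width_deg:
        return 1.0                           # outside the detection area
    # Maximum gain on the axis, decreasing linearly to 1x at the edge.
    return 1.0 + (max_gain - 1.0) * (1.0 - diff / half_width_deg)
```

The returned gain would then multiply the offset-removed Cb and Cr values. Passing a different axis_deg (for example, the ~213° mentioned above) reuses the same logic for another detection axis.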

Some displays also use this technique to 
implement blue enhancement, used to make 
the sky appear more blue. 

Dynamic Contrast 

Using dynamic contrast (also called adap- 
tive contrast enhancement), the differences 
between dark and light portions of the image 
are artificially enhanced based on the content 
in the image. Displays commonly use this tech- 
nique to improve their contrast ratio. 

Bright colors in mostly dark images are
enhanced by making them brighter (white
stretch). This is typically done by using
histogram information to modify the upper
portion of the gamma curve.

Dark colors in mostly light images are
enhanced by making them darker (black
stretch). This is typically done by using
histogram information to modify the lower
portion of the gamma curve.

For a medium-bright image, both tech- 
niques may be applied. 

A minor gamma correction adjustment 
may also be applied to colors that are between 
dark and light, resulting in a more detailed and 
contrasting picture. 
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Color Correction 

The RGB chromaticities are usually 
slightly different between the source video and 
what the display uses. This results in red, 
green and blue colors that are not completely 
accurate. 

Color correction can be done on the 
source video to compensate for the display 
characteristics, enabling more accurate red, 
green and blue colors to be displayed. 

An alternate type of color correction is to 
perform color expansion, taking advantage of 
the greater color reproduction capabilities of 
modern displays. This can result in greener 
greens, bluer blues, etc. One common tech- 
nique of implementing color expansion is to 
use independent hue and saturation controls 
for each primary and complementary color, 
plus the skin color. 

Color Temperature Correction 

In an uncalibrated television, the color 
temperature (white color) varies based on the 
brightness level. 

The color temperature of D65, the white
point specified by most video standards, is
6500 K. Color temperatures above 6500 K
are more bluish (cool); color temperatures
below 6500 K are more reddish (warm).

Many televisions ship from the factory
with a very high average color temperature
(7000-8000 K) to emphasize the brightness
of the set. Viewers can select from two or three
factory presets (warm, cool, etc.) or viewing
modes (movies, sports, etc.), each of which
corresponds to a color temperature. A “cool”
setting is brighter (like what you see in midday
light) and is better for daylight viewing, such
as sporting events, because of the enhanced
brightness. A “warm” setting is softer (like
what you see in a softly lit indoor environment)
and is better for viewing movies, or in
darkened environments.

The color temperature may be finely
adjusted by using a 3 x 3 matrix multiplier to
process the YCbCr or R'G'B' data. 10 registers
(one for every 10 IRE step from 10-100 IRE)
provide the nine coefficients for the 3 x 3
matrix multiplier. The values of the registers
are determined by a calibrating process.
YCbCr or R'G'B' values for intermediate IRE
levels may be determined using interpolation.
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Video Mixing and Graphics 
Overlay 

Mixing video signals may be as simple as 
switching between two video sources. This is 
adequate if the resulting video is to be dis- 
played on a computer monitor. 

For most other applications, a technique 
known as alpha mixing should be used. Alpha 
mixing may also be used to fade to or from a 
specific color (such as black) or to overlay 
computer-generated text and graphics onto a 
video signal. 

Alpha mixing must be used if the video is 
to be encoded to composite video. Otherwise, 
ringing and blurring may appear at the source 
switching points, such as around the edges of 
computer-generated text and graphics. This is 
due to the color information being lowpass fil- 
tered within the NTSC/PAL encoder. If the fil- 
ters have a sharp cut-off, a fast color transition 
will produce ringing. In addition, the intensity 
information may be bandwidth-limited to about 
4-5 MHz somewhere along the video path, 
slowing down intensity transitions. 

Mathematically, with alpha normalized to 
have values of 0-1, alpha mixing is imple- 
mented as: 

out = (alpha_0) (in_0) + (alpha_1) (in_1) + ...

In this instance, each video source has its own
alpha information. The alpha information may
not total to one (unity gain).

Figure 7.10 shows mixing of two YCbCr
video signals, each with its own alpha
information. As YCbCr uses an offset binary
notation, the offset (16 for Y and 128 for Cb
and Cr) is removed prior to mixing the video
signals. After mixing, the offset is added back
in. Note that two 4:2:2 YCbCr streams may also
be processed directly; there is no need to
convert them to 4:4:4 YCbCr, mix, then convert
the result back to 4:2:2 YCbCr.

When only two video sources are mixed
and alpha_0 + alpha_1 = 1 (implementing a
crossfader), a single alpha value may be used,
shown mathematically as:

out = (alpha) (in_0) + (1 - alpha) (in_1)

When alpha = 0, the output is equal to the in_1
video signal; when alpha = 1, the output is
equal to the in_0 video signal. When alpha is
between 0 and 1, the two video signals are
proportionally multiplied, and added together.

Expanding and rearranging the previous
equation shows how a two-channel mixer may
be implemented using a single multiplier:

out = (alpha) (in_0 - in_1) + in_1

Fading to and from a specific color is done by
setting one of the input sources to a constant
color.
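
Both forms of the mixer can be sketched as follows. The function names are hypothetical, and the offset argument (16 for Y, 128 for Cb and Cr, 0 for data without an offset) models the offset removal described for YCbCr data.

```python
def alpha_mix(sources, alphas, offset=0):
    """General mixer: each source has its own alpha (0-1)."""
    # Remove the offset, weight each source, sum, restore the offset.
    return sum(a * (s - offset) for s, a in zip(sources, alphas)) + offset


def crossfade(in_0, in_1, alpha, offset=0):
    """Two-source mixer using a single multiplier.

    Implements out = alpha * (in_0 - in_1) + in_1 on offset-removed data.
    """
    a = in_0 - offset
    b = in_1 - offset
    return alpha * (a - b) + b + offset
```

For example, crossfade(235, 16, 0.5, offset=16) fades a white Y sample halfway to black, and gives the same result as alpha_mix([235, 16], [0.5, 0.5], offset=16).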

Figure 7.11 illustrates mixing two YCbCr
sources using a single alpha channel. Figures
7.12 and 7.13 illustrate mixing two R'G'B'
video sources (R'G'B' has a range of 0-255).
Figures 7.14 and 7.15 show mixing two digital
composite video signals.

A common problem in computer graphics
systems that use alpha is that the frame buffer
may contain preprocessed R'G'B' or YCbCr
data; that is, the R'G'B' or YCbCr data in the
frame buffer has already been multiplied by
alpha. Assuming an alpha (A) value of 0.5,
nonprocessed R'G'B'A values for white are (255,
255, 255, 128); preprocessed R'G'B'A values
for white are (128, 128, 128, 128). Therefore,
any mixing circuit that accepts R'G'B' or
YCbCr data from a frame buffer should be able
to handle either format.
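
Handling either format amounts to skipping the foreground multiply when the data has already been multiplied by alpha. A hedged sketch for one R'G'B' component (the function name and the premultiplied flag are illustrative):

```python
def over(fg, alpha, bg, premultiplied):
    """Mix one foreground component over a background component.

    alpha is normalized to 0-1. If the frame buffer data is
    preprocessed (already multiplied by alpha), the foreground
    multiply is skipped.
    """
    fg_term = fg if premultiplied else fg * alpha
    return fg_term + (1.0 - alpha) * bg
```

Using the white example from the text with alpha = 128/255, the nonprocessed value 255 and the preprocessed value 128 give the same mixed output.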

By adjusting the alpha values, slow to fast
crossfades are possible, as shown in Figure
7.16. Large differences in alpha between
samples result in a fast crossfade; smaller
differences result in a slow crossfade. If using
alpha mixing for special effects, such as wipes,
the switching point (where 50% of each video
source is used) must be able to be adjusted to
an accuracy of less than one sample to ensure
smooth movement. By controlling the alpha
values, the switching point can be effectively
positioned anywhere, as shown in Figure
7.16a.

Figure 7.10. Mixing Two YCbCr Video Signals, Each with Its Own
Alpha Channel.

Figure 7.11. Simplified Mixing (Crossfading) of Two YCbCr Video
Signals Using a Single Alpha Channel.

Figure 7.12. Mixing Two R'G'B' Video Signals (R'G'B' Has a
Range of 0-255), Each with Its Own Alpha Channel.

Figure 7.13. Simplified Mixing (Crossfading) of Two R'G'B' Video
Signals (R'G'B' Has a Range of 0-255) Using a Single Alpha Channel.

Figure 7.14. Mixing Two Digital Composite Video Signals, Each with Its Own Alpha Channel.

Figure 7.15. Simplified Mixing (Crossfading) of Two Digital Composite
Video Signals Using a Single Alpha Channel.

Figure 7.16. Controlling Alpha Values to Implement (A) Fast or (B)
Slow Keying. In (A), the effective switching point lies between two
samples. In (B), the transition is wider and is aligned at a sample
instant.

Text can be overlaid onto video by having a 
character generator control the alpha inputs. 
By setting one of the input sources to a con- 
stant color, the text will assume that color. 

Note that for those designs that subtract 16 
(the black level) from the Y channel before 
processing, negative Y values should be sup- 
ported after the subtraction. This allows the 
design to pass through real-world and test 
video signals with minimum artifacts. 



Luma and Chroma Keying 

Keying involves specifying a desired fore- 
ground color; areas containing this color are 
replaced with a background image. Alter- 
nately, an area of any size or shape may be 
specified; foreground areas inside (or outside) 
this area are replaced with a background 
image. 

Luminance Keying 

Luminance keying involves specifying a 
desired foreground luminance level; fore- 
ground areas containing luminance levels 
above (or below) the keying level are replaced 
with the background image. 

Alternately, this hard keying implementation
may be replaced with soft keying by specifying
two luminance values of the foreground
image: $Y_H$ and $Y_L$ ($Y_L < Y_H$). For keying the
background into white foreground areas,
foreground luminance values ($Y_{FG}$) above $Y_H$ are
replaced with the background image; $Y_{FG}$
values below $Y_L$ contain the foreground image.
For $Y_{FG}$ values between $Y_L$ and $Y_H$, linear
mixing is done between the foreground and
background images. This operation may be
expressed as:

$$
K = \begin{cases}
1 & \text{if } Y_{FG} > Y_H \quad \text{(background only)} \\
0 & \text{if } Y_{FG} < Y_L \quad \text{(foreground only)} \\
\dfrac{Y_{FG} - Y_L}{Y_H - Y_L} & \text{if } Y_H \ge Y_{FG} \ge Y_L \quad \text{(mix)}
\end{cases}
$$

By subtracting K from 1, the new lumi- 
nance keying signal for keying into black fore- 
ground areas can be generated. 
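
The soft keying operation can be sketched as below. The function names are hypothetical; the low threshold is assumed to be less than the high threshold, and K weights the background as in the expressions above.

```python
def luma_key(y_fg, y_low, y_high):
    """Soft luminance key K (0 = foreground only, 1 = background only)."""
    if y_fg > y_high:
        return 1.0
    if y_fg < y_low:
        return 0.0
    return (y_fg - y_low) / (y_high - y_low)   # linear mix region


def key_mix(fg, bg, k):
    """Combine foreground and background samples using key K."""
    return k * bg + (1.0 - k) * fg
```

For keying into black foreground areas, 1 - luma_key(...) would be used in place of K.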

Figure 7.17 illustrates luminance keying 
for two YCbCr sources. Although chroma key- 
ing typically uses a suppression technique to 
remove information from the foreground 
image, this is not done when luminance keying 
as the magnitudes of Cb and Cr are usually not 
related to the luminance level. 

Figure 7.18 illustrates luminance keying
for R'G'B' sources, which is more applicable
for computer graphics. $Y_{FG}$ may be obtained
by the equation:

$$Y_{FG} = 0.299R' + 0.587G' + 0.114B'$$

In some applications, the red and blue data is
ignored, resulting in $Y_{FG}$ being equal to only
the green data.

Figure 7.19 illustrates one technique of 
luminance keying between two digital compos- 
ite video sources. 
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Figure 7.19. Luminance Keying of Two Digital Composite Video Signals. 



Chroma Keying 

Chroma keying involves specifying a 
desired foreground key color; foreground 
areas containing the key color are replaced 
with the background image. Cb and Cr are 
used to specify the key color; luminance infor- 
mation may be used to increase the realism of 
the chroma keying function. The actual mixing 
of the two video sources may be done in the 
component or composite domain, although 
component mixing reduces artifacts. 

Early chroma keying circuits simply
performed a hard or soft switch between the
foreground and background sources. In addition
to limiting the amount of fine detail maintained
in the foreground image, the background was not
visible through transparent or translucent
foreground objects, and shadows from the
foreground were not present in areas containing
the background image.

Linear keyers were developed that com- 
bine the foreground and background images 
in a proportion determined by the key level, 
resulting in the foreground image being atten- 
uated in areas containing the background 
image. Although allowing foreground objects 
to appear transparent, there is a limit on the 
fineness of detail maintained in the fore- 
ground. Shadows from the foreground are not 
present in areas containing the background 
image unless additional processing is done — 
the luminance levels of specific areas of the 
background image must be reduced to create 
the effect of shadows cast by foreground 
objects. 
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If the blue or green backing used with the 
foreground scene is evenly lit except for shad- 
ows cast by the foreground objects, the effect 
on the background will be that of shadows cast 
by the foreground objects. This process, 
referred to as shadow chroma keying, or lumi- 
nance modulation, enables the background 
luminance levels to be adjusted in proportion 
to the brightness of the blue or green backing 
in the foreground scene. This results in more 
realistic keying of transparent or translucent 
foreground objects by preserving the specular
highlights.

Note that green backgrounds are now 
more commonly used due to lower chroma 
noise. 

Chroma keyers are also limited in their
ability to handle foreground colors that are
close to the key color without switching to the
background image. Another problem may be a
bluish tint to the foreground objects as a result
of blue light reflecting off the blue backing or
being diffused in the camera lens. Chroma
spill is difficult to remove since the spill color
is not the original key color; some mixing
occurs, changing the original key color
slightly.

One solution to many of the chroma keying
problems is to process the foreground and
background images individually before
combining them, as shown in Figure 7.20. Rather
than choosing between the foreground and
background, each is processed individually
and then combined. Figure 7.21 illustrates the
major processing steps for both the foreground
and background images during the
chroma key process. Not shown in Figure 7.20
is the circuitry to initially subtract 16 (Y) or
128 (Cb and Cr) from the foreground and
background video signals and the addition of
16 (Y) or 128 (Cb and Cr) after the final output
adder. Any DC offset not removed will be
amplified or attenuated by the foreground and
background gain factors, shifting the black
level.

Figure 7.20. Typical Component Chroma Key Circuit.

Figure 7.21. Major Processing Steps During Chroma Keying. (A)
Original foreground scene. (B) Original background scene. (C)
Suppressed foreground scene. (D) Background keying signal. (E)
Background scene after multiplication by background key. (F)
Composite scene generated by adding (C) and (E).

The foreground key ($K_{FG}$) and background
key ($K_{BG}$) signals have a range of 0 to
1. The garbage matte key signal (the term
matte comes from the film industry) forces the
mixer to output the foreground source in one
of two ways.

The first method is to reduce $K_{BG}$ in
proportion to increasing $K_{FG}$. This provides the
advantage of minimizing black edges around
the inserted foreground.

The second method is to force the back- 
ground to black for all nonzero values of the 
matte key, and insert the foreground into the 
background hole. This requires a cleanup func- 
tion to remove noise around the black level, as 
this noise affects the background picture due 
to the straight addition process. 

The garbage matte is added to the foreground
key signal ($K_{FG}$) using a nonadditive
mixer (NAM). A nonadditive mixer takes the
brighter of the two pictures, on a
sample-by-sample basis, to generate the key
signal. Matting is ideal for any source that
generates its own keying signal, such as
character generators, and so on.
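
In software terms, a nonadditive mixer is a per-sample maximum. A minimal sketch (the function name is illustrative):

```python
def non_additive_mix(key_a, key_b):
    """Take the larger (brighter) of two signals, sample by sample."""
    return [max(a, b) for a, b in zip(key_a, key_b)]
```

Unlike an adder, the output never exceeds the larger of the two inputs, so combining a matte with the foreground key cannot overflow the key range.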

The key generator monitors the foreground Cb and Cr data, generating
the foreground keying signal, K_FG. A desired key color is selected,
as shown in Figure 7.22. The foreground Cb and Cr data are normalized
(generating Cb' and Cr') and rotated θ degrees to generate the X and Z
data, such that the positive X axis passes as close as possible to the
desired key color. Typically, θ may be varied in 1° increments, and
optimum chroma keying occurs when the X axis passes through the key
color.

X and Z are derived from Cb and Cr using the equations:

X = Cb' cos θ + Cr' sin θ

Z = Cr' cos θ - Cb' sin θ

Since Cb' and Cr' are normalized to have a range of ±1, X and Z have
a range of ±1.

The foreground keying signal (K_FG) is generated from X and Z and has
a range of 0-1:

K_FG = X - (|Z| / tan(α/2))

K_FG = 0 if X < (|Z| / tan(α/2))

where α is the acceptance angle, symmetrically centered about the
positive X axis, as shown in Figure 7.23. Outside the acceptance
angle, K_FG is always set to zero. Inside the acceptance angle, the
magnitude of K_FG linearly increases the closer the foreground color
approaches the key color and as its saturation increases. Colors
inside the acceptance angle are further processed by the foreground
suppressor.
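
The key generation equations above can be sketched for a single sample as follows. This is a minimal illustration, not a reference implementation: the function name and argument names are hypothetical, and the inputs are assumed to be already normalized Cb' and Cr' values.

```python
import math

def foreground_key(cb, cr, theta_deg, alpha_deg):
    """Return (x, z, k_fg) for one normalized chroma sample.

    cb, cr    -- normalized Cb' and Cr' (assumed range -1..+1)
    theta_deg -- rotation that puts the +X axis through the key color
    alpha_deg -- acceptance angle, centered on the +X axis (0 < alpha < 180)
    """
    theta = math.radians(theta_deg)
    x = cb * math.cos(theta) + cr * math.sin(theta)
    z = cr * math.cos(theta) - cb * math.sin(theta)
    edge = abs(z) / math.tan(math.radians(alpha_deg) / 2.0)
    # K_FG is zero outside the acceptance angle, X - |Z|/tan(alpha/2) inside it.
    k_fg = x - edge if x > edge else 0.0
    return x, z, k_fg
```

A sample lying exactly on the key color axis gets the largest key value for its saturation, while a sample on the opposite side of the CbCr plane gets a key value of zero.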

The foreground suppressor reduces foreground color information by
implementing X = X - K_FG, with the key color being clamped to the
black level. To avoid processing Cb and Cr when K_FG = 0, the
foreground suppressor performs the operations:

Cb_FG = Cb - K_FG cos θ

Cr_FG = Cr - K_FG sin θ

where Cb_FG and Cr_FG are the foreground Cb and Cr values after key
color suppression. Early implementations suppressed foreground
information by multiplying Cb and Cr by a clipped version of the K_FG
signal. This, however, generated in-band alias components due




218 Chapter 7: Digital Video Processing
Figure 7.22. Rotating the Normalized Cb and Cr (Cb' and Cr') Axes by θ to Obtain the X and Z
Axes, Such That the X Axis Passes Through the Desired Key Color (Blue in This Example).







Figure 7.23. Foreground Key Values and Acceptance Angle. 
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to the multiplication and clipping process and 
produced a hard edge at key color boundaries. 

Unless additional processing is done, the Cb_FG and Cr_FG components
are set to zero only if they are exactly on the X axis. Hue variations
due to noise or lighting will result in areas of the foreground not
being entirely suppressed. Therefore, a suppression angle is set,
symmetrically centered about the positive X axis. The suppression
angle (β) is typically configurable from a minimum of zero degrees to
a maximum of about one-third the acceptance angle (α). Any CbCr
components that fall within this suppression angle are set to zero.
Figure 7.24 illustrates the use of the suppression angle.

Foreground luminance, after being normalized to have a range of 0-1,
is suppressed by:

Y_FG = Y' - y_S K_FG

Y_FG = 0 if y_S K_FG > Y'

Here, y_S is a programmable value used to adjust Y_FG so that it is
clipped at the black level in the key color areas.
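
The chroma and luminance suppression steps can be sketched together for one sample. This is an illustrative sketch under stated assumptions: the function name is hypothetical, all inputs are assumed normalized, and the suppression angle is tested against the hue of the incoming sample, which is one possible reading of the description above.

```python
import math

def suppress_foreground(y, cb, cr, k_fg, theta_deg, beta_deg, y_s):
    """Return (y_fg, cb_fg, cr_fg) after key color suppression."""
    theta = math.radians(theta_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)

    # Chroma: subtract K_FG along the key color (+X) axis.
    cb_fg = cb - k_fg * cos_t
    cr_fg = cr - k_fg * sin_t

    # Suppression angle (assumption: measured on the incoming sample's hue).
    x = cb * cos_t + cr * sin_t
    z = cr * cos_t - cb * sin_t
    if x > 0.0 and abs(math.atan2(z, x)) <= math.radians(beta_deg) / 2.0:
        cb_fg, cr_fg = 0.0, 0.0

    # Luminance: Y_FG = Y' - y_S * K_FG, clipped at the black level.
    y_fg = max(0.0, y - y_s * k_fg)
    return y_fg, cb_fg, cr_fg
```

Samples with K_FG = 0 that lie away from the key color pass through unchanged, which matches the intent of leaving wanted foreground areas alone.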

The foreground suppressor also removes 
key-color fringes on wanted foreground areas 
caused by chroma spill, the overspill of the key 
color, by removing discolorations of the 
wanted foreground objects. 

Ultimatte® improves on this process by 
measuring the difference between the blue 
and green colors, as the blue backing is never 
pure blue and there may be high levels of blue 
in the foreground objects. Pure blue is rarely 
found in nature, and most natural blues have a 
higher content of green than red. For this rea- 
son, the red, green, and blue levels are moni- 
tored to differentiate between the blue backing 
and blue in wanted foreground objects. 

If the difference between blue and green is 
great enough, all three colors are set to zero to 



produce black; this is what happens in areas of 
the foreground containing the blue backing. 

If the difference between blue and green is 
not large, the blue is set to the green level 
unless the green exceeds red. This technique 
allows the removal of the bluish tint caused by 
the blue backing while being able to reproduce 
natural blues in the foreground. As an exam- 
ple, a white foreground area normally would 
consist of equal levels of red, green, and blue. 
If the white area is affected by the key color 
(blue in this instance), it will have a bluish 
tint — the blue levels will be greater than the 
red or green levels. Since the green does not 
exceed the red, the blue level is made equal to 
the green, removing the bluish tint. 

There is a price to pay, however. Magenta 
in the foreground is changed to red. A green 
backing can be used, but in this case, yellow in 
the foreground is modified. Usually, the clamp- 
ing is released gradually to increase the blue 
content of magenta areas. 

The key processor generates the initial background key signal (K'_BG)
used to remove areas of the background image where the foreground is
to be visible. K'_BG is adjusted to be zero in desired foreground
areas and unity in background areas with no attenuation. It is
generated from the foreground key signal (K_FG) by applying lift (k_L)
and gain (k_G) adjustments followed by clipping at zero and unity
values:

K'_BG = (K_FG - k_L) k_G

Figure 7.25 illustrates the operation of the background key signal
generation. The transition between K'_BG = 0 and K'_BG = 1 should be
made as wide as possible to minimize discontinuities in the
transitions between foreground and background areas.
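
The lift, gain, and clip operation is small enough to show directly. The function name below is hypothetical, and the lift and gain values in the test are arbitrary illustration values rather than recommended settings.

```python
def background_key(k_fg, lift, gain):
    """Initial background key: (K_FG - lift) * gain, clipped to the range 0..1."""
    return min(1.0, max(0.0, (k_fg - lift) * gain))
```

A smaller gain widens the transition between 0 and 1, which is the behavior the text recommends for soft edges.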

Figure 7.25. Background Key Generation.

For foreground areas containing the same CbCr values as the key
color, but different luminance (Y) values, the key processor may also
reduce the background key value as the foreground luminance level
increases. This allows the background to be turned off in foreground
areas containing a lighter key color, such as light blue. This is
done by:

K_BG = K'_BG - y_C Y_FG

K_BG = 0 if y_C Y_FG > K_FG

To handle shadows cast by foreground 
objects, and opaque or translucent foreground 
objects, the luminance level of the blue back- 
ing of the foreground image is monitored. 
Where the luminance of the blue backing is 
reduced, the luminance of the background 
image also is reduced. The amount of back- 
ground luminance reduction must be con- 
trolled so that defects in the blue backing 



(such as seams or footprints) are not inter- 
preted as foreground shadows. 

Additional controls may be implemented to 
enable the foreground and background signals 
to be controlled independently. Examples are 
adjusting the contrast of the foreground so it 
matches the background or fading the fore- 
ground in various ways (such as fading to the 
background to make a foreground object van- 
ish or fading to black to generate a silhouette) . 

In the computer environment, there may 
be relatively slow, smooth edges — especially 
edges involving smooth shading. As smooth 
edges are easily distorted during the chroma 
keying process, a wide keying process is usu- 
ally used in these circumstances. During wide 
keying, the keying signal starts before the 
edge of the graphic object. 
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Composite Chroma Keying 

In some instances, the component signals 
(such as YCbCr) are not directly available. For 
these situations, composite chroma keying 
may be implemented, as shown in Figure 7.26. 

To detect the chroma key color, the foreground video source must be
decoded to produce the Cb and Cr color difference signals. The keying
signal, K_FG, is then used to mix between the two composite video
sources. The garbage matte key signal forces the mixer to output the
background source by reducing K_FG.

Chroma keying using composite video sig- 
nals usually results in unrealistic keying, since 
there is inadequate color bandwidth. As a 
result, there is a lack of fine detail, and halos 
may be present on edges. 



Superblack and Luma Keying 

Video systems also may make use of super- 
black or luma keying. Areas of the foreground 
video that have a value within a specified range 
below the blanking level (analog video) or 
black level (digital video) are replaced with the 
background video information. 



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Figure 7.26. Typical Composite Chroma Key Circuit. 









Video Scaling 

With so many video resolutions in use (Table 7.3), scaling is needed
in almost every solution.

When generating objects that will be dis- 
played on SDTV, computer users must be con- 
cerned with such things as text size, line 
thickness, and so forth. For example, text 
readable on a 1280 x 1024 computer display 
may not be readable on an SDTV display due 
to the large amount of downscaling involved. 
Thin horizontal lines may either disappear 
completely or flicker at a 25 or 29.97 Hz rate 
when converted to interlaced SDTV. 



Note that scaling must be performed on component video signals (such
as R'G'B' or YCbCr). Composite color video signals cannot be scaled
directly due to the color subcarrier phase information present, which
would be meaningless after scaling.

In general, the spacing between output samples can be defined by a
Target Increment (tarinc) value:

tarinc = I / O

where I and O are the number of input (I) and output (O) samples,
either horizontally or vertically.

The first and last output samples may be aligned with the first and
last input samples by adjusting the equation to be:

tarinc = (I - 1) / (O - 1)
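
As a quick illustration of the two forms of the target increment, the sketch below lists where each output sample falls on the input sample grid. The function names and the `align_ends` flag are hypothetical.

```python
def target_increment(num_in, num_out, align_ends=False):
    """Spacing between output samples, measured in input samples."""
    if align_ends:
        # First and last output samples coincide with first and last inputs.
        return (num_in - 1) / (num_out - 1)
    return num_in / num_out

def output_positions(num_in, num_out, align_ends=False):
    """Position of every output sample on the input sample grid."""
    tarinc = target_increment(num_in, num_out, align_ends)
    return [m * tarinc for m in range(num_out)]
```

For four input samples and three output samples, the first form places the outputs at 0, 4/3, and 8/3, while the end-aligned form places them at 0, 1.5, and 3.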



Displays                      SDTV Sources                    HDTV Sources

704 x 480      640 x 480      704 x 360 (1)   704 x 432 (1)   1280 x 720
854 x 480      800 x 600      480 x 480       480 x 576       1440 x 816 (2)
704 x 576      1024 x 768     528 x 480                       1440 x 1040 (3)
854 x 576      1280 x 768     544 x 480       544 x 576       1280 x 1080
1280 x 720     1366 x 768     640 x 480                       1440 x 1080
1280 x 768     1024 x 1024    704 x 480       704 x 576       1920 x 1080
1920 x 1080    1280 x 1024                    768 x 576

Table 7.3. Common Active Resolutions for Consumer Displays and Broadcast
Sources. (1) 16:9 letterbox on a 4:3 display. (2) 2.35:1 anamorphic for a
16:9 1920x1080 display. (3) 1.85:1 anamorphic for a 16:9 1920x1080 display.
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Pixel Dropping and Duplication 

This is also called “nearest neighbor” scal- 
ing since only the input sample closest to the 
output sample is used. 

The simplest form of scaling down is pixel 
dropping, where (m) out of every (n) samples 
are thrown away both horizontally and verti- 
cally. A modified version of the Bresenham 
line-drawing algorithm (described in most 
computer graphics books) is typically used to 
determine which samples not to discard. 

Simple upscaling can be accomplished by 
pixel duplication, where (m) out of every (n) 
samples are duplicated both horizontally and 
vertically. Again, a modified version of the 
Bresenham line-drawing algorithm can be 
used to determine which samples to duplicate. 

Scaling using pixel dropping or duplication 
is not recommended due to the visual artifacts 
and the introduction of aliasing components. 
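
The dropping and duplication schemes above can be sketched in one dimension with an integer error accumulator in the spirit of the Bresenham algorithm. This is an assumption-laden sketch rather than the modified algorithm the text refers to: the function name is hypothetical, and which particular samples survive depends on the accumulator's starting value, taken as zero here.

```python
def drop_or_duplicate(samples, num_out):
    """Scale a row of samples to num_out samples by dropping or duplicating.

    An integer accumulator decides, per input sample, how many times that
    sample is emitted (0 = dropped, 1 = kept, 2 or more = duplicated).
    """
    num_in = len(samples)
    out = []
    err = 0
    for s in samples:
        err += num_out
        while err >= num_in:
            err -= num_in
            out.append(s)
    return out
```

For two-dimensional images, the same routine would be applied to every row and then to every column.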

Linear Interpolation 

An improvement in video quality of scaled images is possible using
linear interpolation. When an output sample falls between two input
samples (horizontally or vertically), the output sample is computed
by linearly interpolating between the two input samples. However,
scaling to images smaller than one-half of the original still results
in deleted samples.

Figure 7.27 illustrates the vertical scaling of a 16:9 image to fit
on a 4:3 display. A simple bi-linear vertical filter is commonly used,
as shown in Figure 7.28a. Two source samples, L_n and L_(n+1), are
weighted and added together to form a destination sample, D_m.

D_0 = 0.75 L_0 + 0.25 L_1

D_1 = 0.5 L_1 + 0.5 L_2

D_2 = 0.25 L_2 + 0.75 L_3

However, as seen in Figure 7.28a, this results in uneven line spacing,
which may result in visual artifacts. Figure 7.28b illustrates
vertical filtering that results in the output lines being more evenly
spaced:

D_0 = L_0

D_1 = (2/3) L_1 + (1/3) L_2

D_2 = (1/3) L_2 + (2/3) L_3

The linear interpolator is a poor band- 
width-limiting filter. Excess high-frequency 
detail is removed unnecessarily and too much 
energy above the Nyquist limit is still present, 
resulting in aliasing. 
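
A minimal one-dimensional sketch of linear interpolation driven by the target increment is shown below. The function name is hypothetical, and repeating the last input sample at the edge is an assumption made to keep the example short.

```python
def scale_linear(samples, num_out):
    """Resize a row (or column) of samples by linear interpolation."""
    num_in = len(samples)
    tarinc = num_in / num_out              # spacing of outputs on the input grid
    out = []
    for m in range(num_out):
        pos = m * tarinc
        n = int(pos)                       # input sample at or before the output
        frac = pos - n                     # distance past that input sample
        nxt = samples[min(n + 1, num_in - 1)]   # assumption: repeat the last sample
        out.append((1.0 - frac) * samples[n] + frac * nxt)
    return out
```

With four input lines and three output lines, the weights work out to 1, then 2/3 and 1/3, then 1/3 and 2/3, which is the evenly spaced 75% vertical scaling case.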

Anti-Aliased Resampling 

The most desirable approach is to ensure 
the frequency content scales proportionally 
with the image size, both horizontally and ver- 
tically. 

Figure 7.29 illustrates the fundamentals of 
an anti-aliased resampling process. The input 
data is upsampled by A and lowpass filtered to 
remove image frequencies created by the 
interpolation process. Filter B bandwidth-limits 
the signal to remove frequencies that will alias 
in the resampling process B. The ratio of B/A 
determines the scaling factor. 

Filters A and B are usually combined into a 
single filter. The response of the filter largely 
determines the quality of the interpolation. 
The ideal lowpass filter would have a very flat 
passband, a sharp cutoff at half of the lowest 
sampling frequency (either input or output), 
and very high attenuation in the stopband. 
However, since such a filter generates ringing 
on sharp edges, it is usually desirable to roll off 
the top of the passband. This makes for 
slightly softer pictures, but with less pro- 
nounced ringing. 
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Figure 7.27. Vertical Scaling of 16:9 Images to Fit on a 
4:3 Display. (A) 480-line systems. (B) 576-line systems. 
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Figure 7.28. 75% Vertical Scaling of 16:9 Images to Fit on a 4:3 Display. 
(A) Unevenly spaced results. (B) Evenly spaced results. 
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Figure 7.29. General Anti-Aliased Resampling Structure. 
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Passband ripple and stopband attenuation 
of the filter provide some measure of scaling 
quality, but the subjective effect of ringing 
means a flat passband might not be as good as 
one might think. Lots of stopband attenuation 
is almost always a good thing. 

There are essentially three variations of 
the general resampling structure. Each com- 
bines the elements of Figure 7.29 in various 
ways. 

One approach is a variable-bandwidth anti- 
aliasing filter followed by a combined interpola- 
tor/resampler. In this case, the filter needs 
new coefficients for each scale factor — as the 
scale factor is changed, the quality of the 
image may vary. In addition, the overall 
response is poor if linear interpolation is used. 
However, the filter coefficients are time-invari- 
ant and there are no gain problems. 

A second approach is a combined filter/ 
interpolator followed by a resampler. Gener- 
ally, the higher the order of interpolation, n, 
the better the overall response. The center of 
the filter transfer function is always aligned 
over the new output sample. With each scaling 
factor, the filter transfer function is stretched 
or compressed to remain aligned over n output 
samples. Thus, the filter coefficients, and the 
number of input samples used, change with 
each new output sample and scaling factor. 
Dynamic gain normalization is required to 
ensure the sum of the filter coefficients is 
always equal to one. 

A third approach is an interpolator followed by a combined
filter/resampler. The input data is interpolated up to a common
multiple of the input and output rates by the insertion of zero
samples. This is filtered with a lowpass finite-impulse-response (FIR)
filter to interpolate samples in the zero-filled gaps, then resampled
at the required locations. This type of design is usually achieved
with a “polyphase” filter, which switches its coefficients as the
relative position of the input and output samples changes.
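
The third approach can be sketched in one dimension as follows. The Hamming-windowed sinc design, the `lobes` parameter, and both function names are assumptions chosen for brevity, not the filter any particular product uses. Zero-stuffed samples are never stored. Only the taps that land on real input samples are evaluated, which is what selects a different coefficient phase for each output position.

```python
import math

def lowpass_taps(num_taps, cutoff):
    """Hamming-windowed sinc; cutoff is in cycles per sample (0 to 0.5)."""
    mid = (num_taps - 1) / 2.0
    taps = []
    for i in range(num_taps):
        t = i - mid
        if t == 0:
            ideal = 2.0 * cutoff
        else:
            ideal = math.sin(2.0 * math.pi * cutoff * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * i / (num_taps - 1))
        taps.append(ideal * window)
    return taps

def resample(samples, up, down, lobes=4):
    """Resample by the ratio up/down (up=3, down=4 shrinks to 75%)."""
    step = max(up, down)
    half = lobes * step                      # half-length on the upsampled grid
    # Cut off at half of the lower of the input and output sampling rates.
    taps = lowpass_taps(2 * half + 1, 0.5 / step)
    num_out = len(samples) * up // down
    out = []
    for m in range(num_out):
        center = m * down                    # output position on the upsampled grid
        acc = 0.0
        gain = 0.0
        for i, tap in enumerate(taps):
            j = center + half - i            # upsampled index under this tap
            if j % up == 0 and 0 <= j // up < len(samples):
                acc += tap * samples[j // up]
                gain += tap
        out.append(acc / gain)               # normalize the gain per output sample
    return out
```

Normalizing by the sum of the taps actually used keeps flat areas flat, including near the edges where part of the filter falls outside the input.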

Display Scaling Examples 

Figures 7.30 through 7.38 illustrate various 
scaling examples for displaying 16:9 and 4:3 
pictures on 4:3 and 16:9 displays, respectively. 

How content is displayed is a combination of user preferences and
content aspect ratio. For example, when displaying 16:9 content on a
4:3 display, many users prefer to have the entire display filled with
the cropped picture (Figure 7.31) rather than seeing black or gray
bars with the letterbox solution (Figure 7.32).

In addition, some displays incorrectly assume any progressive video
signal on their YPbPr inputs is from an “anamorphic” source. As a
result, they horizontally upscale progressive 16:9 programs by 25%
when no scaling should be applied. Therefore, for set-top boxes it is
useful to include a “16:9 (Compressed)” mode, which horizontally
downscales the progressive 16:9 program by 25% to pre-compensate for
the horizontal upscaling being done by the 16:9 display.



Figure 7.31. Scaling 16:9 Content for a 4:3 Display: “Normal” or pan-and-scan
mode. Results in some of the 16:9 content being ignored (indicated by gray
regions).

Figure 7.32. Scaling 16:9 Content for a 4:3 Display: “Letterbox” mode.
Entire 16:9 program visible, with black bars at top and bottom of display.

Figure 7.33. Scaling 16:9 Content for a 4:3 Display: “Squeezed” mode.
Entire 16:9 program horizontally squeezed to fit 4:3 display, resulting in a
distorted picture.

Figure 7.34. 4:3 Source Example.

Figure 7.35. Scaling 4:3 Content for a 16:9 Display: “Normal” mode. Left and
right portions of 16:9 display not used, so made black or gray.

Figure 7.36. Scaling 4:3 Content for a 16:9 Display: “Wide” mode. Entire
picture linearly scaled horizontally to fill 16:9 display, resulting in
distorted picture unless used with anamorphic content.

Figure 7.37. Scaling 4:3 Content for a 16:9 Display: “Zoom” mode.
Top and bottom portion of 4:3 picture deleted, then scaled to fill 16:9
display.

Figure 7.38. Scaling 4:3 Content for a 16:9 Display: “Panorama”
mode. Left and right 25% edges of picture are nonlinearly scaled
horizontally to fill 16:9 display, resulting in a distorted picture on
the left and right sides.



Scan Rate Conversion

In many cases, some form of scan rate conversion (also called temporal
rate conversion, frame rate conversion, or field rate conversion) is
needed. Multi-standard analog VCRs and scan converters use scan rate
conversion to convert between various video standards. Computers
usually operate the display at about 75 Hz noninterlaced, yet need to
display 50 and 60 Hz interlaced video. With digital television,
multiple frame rates can be supported.

Note that processing must be performed on component video signals
(such as R'G'B' or YCbCr). Composite color video signals cannot be
processed directly due to the color subcarrier phase information
present, which would be meaningless after processing.

Frame or Field Dropping and 
Duplicating 

Simple scan-rate conversion may be done 
by dropping or duplicating one out of every N 
fields. For example, the conversion of 60 Hz to 
50 Hz interlaced operation may drop one out of 
every six fields, as shown in Figure 7.39, using 
a single field store. 
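
A minimal sketch of field dropping follows. The function name is hypothetical, the choice of which field in each group of six to discard is an assumption, and the sketch ignores the field parity handling that a real converter performs with its field store.

```python
def drop_fields(fields, period=6):
    """Drop one field out of every `period` fields (60 Hz to 50 Hz when period=6)."""
    # Assumption: the last field of each group is the one discarded.
    return [f for i, f in enumerate(fields) if i % period != period - 1]
```

One second of 60 Hz video (60 fields) therefore yields 50 fields.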



The disadvantage of this technique is that 
the viewer may see jerky motion, or motion 
judder. In addition, some video decompression 
products use top-field only to convert from 60 
Hz to 50 Hz, degrading the vertical resolution. 

The worst artifacts are present when a 
non-integer scan rate conversion is done — for 
example, when some frames are displayed 
three times, while others are displayed twice. 
In this instance, the viewer will observe double 
or blurred objects. As the human brain tracks 
an object in successive frames, it expects to 
see a regular sequence of positions, and has 
trouble reconciling the apparent stop-start 
motion of objects. As a result, it incorrectly 
concludes that there are two objects moving in 
parallel. 
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Figure 7.39. 60 Hz to 50 Hz Conversion Using a Single Field Store by 
Dropping One out of Every Six Fields. 
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Figure 7.40. 50 Hz to 60 Hz Conversion Using Temporal 
Interpolation with No Motion Compensation. 
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Temporal Interpolation 

This technique generates new frames from 
the original frames as needed to generate the 
desired frame rate. Information from both past 
and future input frames should be used to opti- 
mally handle objects appearing and disappear- 
ing. 

Conversion of 50 Hz to 60 Hz operation 
using temporal interpolation is illustrated in 
Figure 7.40. For every five fields of 50 Hz 
video, there are six fields of 60 Hz video. 

After both sources are aligned, two adja- 
cent 50 Hz fields are mixed together to gener- 
ate a new 60 Hz field. This technique is used in 
some inexpensive standards converters to con- 
vert between 50 Hz and 60 Hz standards. Note 
that no motion analysis is done. Therefore, if 
the camera operating at 50 Hz pans horizon- 
tally past a narrow vertical object, you see one 
object once every six 60 Hz fields, and for the 
five fields in between, you see two objects, one 
fading in while the other fades out. 
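
The field mixing described above can be sketched as a linear blend of the two nearest 50 Hz fields. The function name is hypothetical, each field is assumed to be a flat list of sample values, and the vertical offset between interlaced fields is ignored, so this illustrates only the temporal weighting.

```python
def convert_50_to_60(fields):
    """Make six output fields from every five input fields by linear blending."""
    num_out = len(fields) * 6 // 5
    out = []
    for j in range(num_out):
        # Output field j sits at input position j * 5/6.
        n, rem = divmod(j * 5, 6)
        frac = rem / 6.0
        nxt = fields[min(n + 1, len(fields) - 1)]   # repeat the last field at the end
        out.append([(1.0 - frac) * a + frac * b for a, b in zip(fields[n], nxt)])
    return out
```

Every sixth output field has a blend weight of zero and is a clean copy of an input field. The five in between are mixtures, which is the source of the double-image effect on moving objects.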

50 Hz to 60 Hz Examples 

Figure 7.41 illustrates a scan rate con- 
verter that implements vertical, followed by 
temporal, interpolation. Figure 7.42 illustrates 
the spectral representation of the design in 
Figure 7.41. 

Many designs now combine the vertical 
and temporal interpolation into a single design, 
as shown in Figure 7.43, with the correspond- 
ing spectral representation shown in Figure 
7.44. This example uses vertical, followed by 
temporal, interpolation. If temporal, followed 
by vertical, interpolation were implemented, 
the field stores would be half the size. How- 
ever, the number of line stores would increase 
from four to eight. 

In either case, the first interpolation process must produce an
intermediate, higher-resolution progressive format to avoid interlace
components that would interfere with the second interpolation process.
It is insufficient to interpolate, either vertically or temporally,
using a mixture of lines from both fields, due to the interpolation
process not being able to compensate for the temporal offset of
interlaced lines.

Motion Compensation 

Higher-quality scan rate converters using 
temporal interpolation incorporate motion 
compensation to minimize motion artifacts. 
This results in extremely smooth and natural 
motion, and images appear sharper and do not 
suffer from motion judder. 

Motion estimation for scan rate conversion 
differs from that used by MPEG. In MPEG, the 
goal is to minimize the displaced frame differ- 
ence (error) by searching for a high correla- 
tion between areas in subsequent frames. The 
resulting motion vectors do not necessarily 
correspond to true motion vectors. 

For scan rate conversion, it is important to 
determine true motion information to perform 
correct temporal interpolation. The interpola- 
tion should be tolerant of incorrect motion vec- 
tors to avoid introducing artifacts as 
unpleasant as those the technique is attempt- 
ing to remove. Motion vectors could be incor- 
rect for several reasons, such as insufficient 
time to track the motion, out-of-range motion 
vectors, and estimation difficulties due to alias- 
ing. 

100 Hz Interlaced Television Example 

A standard 50 Hz interlaced television shows 50 fields per second.
The images flicker, especially when you look at large areas of highly
saturated color. A much improved picture can be achieved using a
100 Hz interlaced frame rate (also called double scan).

Figure 7.42. Spectral Representation of Vertical, Followed by Temporal, Interpolation. (A) Vertical
lowpass filtering. (B) Resampling to intermediate sequential format and temporal lowpass
filtering. (C) Resampling to final standard.

Figure 7.44. Spectral Representation of Combined Vertical and Temporal Interpolation.
(A) Two-dimensional lowpass filtering. (B) Resampling to final standard.

Figure 7.45. 50 Hz to 100 Hz (Double Scan Interlaced) Techniques.

Early 100 Hz televisions simply repeated fields
(F1 F1 F2 F2 F3 F3 F4 F4 ...), as shown in Figure 7.45a. However, they
still had line flicker, where horizontal lines constantly jumped
between the odd and even lines. This disturbance occurred once every
twenty-fifth of a second.

The field sequence F1 F2 F1 F2 F3 F4 F3 F4 ... can be used, which
solves the line flicker problem. Unfortunately, this gives rise to the
problem of judder in moving images. This can be compensated for by
using the F1 F2 F1 F2 F3 F4 F3 F4 ... sequence for static images, and
the F1 F1 F2 F2 F3 F3 F4 F4 ... sequence for moving images.

An ideal picture is still not obtained when viewing programs created
for film. They are subject to judder, owing to the fact that each film
frame is transmitted twice. Instead of the field sequence
F1 F1 F2 F2 F3 F3 F4 F4 ..., the situation calls for the sequence
F1 F1' F2 F2' F3 F3' F4 F4' ... (Figure 7.45b), where Fn' is a
motion-compensated generated image between Fn and F(n+1).



2:2 Pulldown

This technique is used with some film-based compressed content for
50 Hz regions. Film is usually recorded at 24 frames per second.

During compression, the telecine machine is sped up from 24 to 25
frames per second, making the content 25 frames per second
progressive. During decompression, each film frame is simply mapped
into two video fields (resulting in 576i25 or 1080i25 video) or two
video frames (resulting in 576p50, 720p50, or 1080p50 video).

This technique provides higher video quality and avoids motion judder
artifacts. However, it shortens the duration of the program by about
4%, cutting the duration of a 2-hour movie by ~5 minutes. Some audio
decoders cannot handle the 4% faster audio data via S/PDIF
(IEC 60958).

To compensate for the change in audio pitch caused by the telecine
speedup, the audio may be resampled during decoding to restore the
original pitch (costly to do in a low-cost consumer product), or
resampling may be done during program authoring.

3:2 Pulldown

When converting 24 frames per second content to 60 Hz, 3:2 pulldown
is commonly used, as shown in Figure 7.46. During compression, the
film speed is slowed down by 0.1% to 23.976 (24/1.001) frames per
second since 59.94 Hz is used for NTSC timing compatibility. During
decompression, 2 film frames generate 5 video fields (resulting in
480i30 or 1080i30 video) or 5 video frames (resulting in 480p60,
720p60, or 1080p60 video).

Figure 7.46. 3:2 Pulldown for Converting 24 Hz Film to 60 Hz Video.
(Columns: film frame, video field/frame, white flag. O = odd lines if
interlaced output. E = even lines if interlaced output.)
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In scenes of high-speed motion of objects, 
the specific film frame used for a particular 
video field or frame may be manually adjusted 
to minimize motion artifacts. 

3:2 pulldown may also be used during video decompression simply to
increase the frame rate from 23.976 (24/1.001) to 59.94 (60/1.001)
frames per second, avoiding the deinterlacing issue.

Varispeed may be used to cover up prob- 
lems such as defects, splicing, censorship cuts, 
or to change the running time of a program. 
Rather than repeating film frames and causing 
a stutter, the 3:2 relationship between the film 
and video is disrupted long enough to ensure a 
smooth temporal rate. 

Analog laserdiscs used a white flag signal 
to indicate the start of another sequence of 
related fields for optimum still-frame perfor- 
mance. During still-frame mode, the white flag 
signal tells the system to back up two fields (to 
use two fields that have no motion between 
them) to re-display the current frame. 
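
The 3:2 cadence for interlaced output can be sketched as below. The function name is hypothetical, and starting the cadence with a three-field frame and with odd lines first is an assumption, since the figure is not reproduced here.

```python
def pulldown_3_2(frames):
    """Map film frames to (frame, parity) video fields in a 3, 2, 3, 2 cadence."""
    fields = []
    for i, frame in enumerate(frames):
        repeats = 3 if i % 2 == 0 else 2
        for _ in range(repeats):
            # Field parity simply alternates across the output sequence.
            parity = "odd" if len(fields) % 2 == 0 else "even"
            fields.append((frame, parity))
    return fields
```

Two film frames produce five fields, so 24 film frames produce 60 fields.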

3:3 Pulldown 

This technique is used in some displays 
that support 72 Hz frame rate. The 24 frames 
per second film-based content is converted to 
72 Hz progressive by simply duplicating each 
film frame three times. 

24:1 Pulldown 

This technique, also called 12:1 pulldown, can be used to convert
24 frames per second content to 50 fields per second.

Two video fields are generated from every 
film frame, except every 12th film frame gener- 
ates 3 video fields. Although the audio pitch is 
correct, motion judder is present every one- 
half second when smooth motion is present. 
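
A short sketch of this cadence is given below. The function name is hypothetical, and placing the extra field on the last frame of each group of twelve is an assumption.

```python
def pulldown_12_1(frames):
    """Two fields per film frame, except that every 12th frame yields three."""
    fields = []
    for i, frame in enumerate(frames):
        repeats = 3 if i % 12 == 11 else 2
        fields.extend([frame] * repeats)
    return fields
```

One second of film (24 frames) yields 50 fields, so the running time and audio pitch are unchanged.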



Noninterlaced-to-Interlaced Conversion

In some applications, it is necessary to dis- 
play a noninterlaced video signal on an inter- 
laced display. Thus, some form of 
noninterlaced-to-interlaced conversion may be 
required. 

Noninterlaced-to-interlaced conversion must be performed on component
video signals (such as R'G'B' or YCbCr). Composite color video signals
(such as NTSC or PAL) cannot be processed directly due to the presence
of color subcarrier phase information, which would be meaningless
after processing. These signals must be decoded into component color
signals, such as R'G'B' or YCbCr, prior to conversion.

There are essentially two techniques: scan 
line decimation and vertical filtering. 

Scan Line Decimation 

The easiest approach is to throw away 
every other active scan line in each noninter- 
laced frame, as shown in Figure 7.47. Although 
the cost is minimal, there are problems with 
this approach, especially with the top and bot- 
tom of objects. 

If there is a sharp vertical transition of 
color or intensity, it will flicker at one-half the 
frame rate. The reason is that it is only dis- 
played every other field as a result of the deci- 
mation. For example, a horizontal line that is 
one noninterlaced scan line wide will flicker on 
and off. Horizontal lines that are two noninter- 
laced scan lines wide will oscillate up and 
down. 

Simple decimation may also add aliasing artifacts. While not
necessarily visible, they will affect any future processing of the
picture.

Figure 7.47. Noninterlaced-to-Interlaced Conversion Using Scan Line Decimation.

Figure 7.48. Noninterlaced-to-Interlaced Conversion Using 3-Line Vertical Filtering.
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Vertical Filtering 

A better solution is to use two or more 
lines of noninterlaced data to generate one line 
of interlaced data. Fast vertical transitions are 
smoothed out over several interlaced lines. 

For a 3-line filter, such as shown in Figure 
7.48, typical coefficients are [0.25, 0.5, 0.25]. 
Using more than three lines usually results in 
excessive blurring, making small text difficult 
to read. 

An alternate implementation uses IIR rather than FIR filtering. In
addition to averaging, this technique produces a reduction in
brightness around objects, further reducing flicker.

Note that care must be taken at the begin- 
ning and end of each frame in the event that 
fewer scan lines are available for filtering. 
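
The 3-line vertical filter can be sketched as follows. The function name is hypothetical, a frame is assumed to be a list of scan lines (each a list of samples), and repeating the top and bottom lines at the frame edges is just one way of handling the missing lines.

```python
def to_interlaced_field(frame, field, taps=(0.25, 0.5, 0.25)):
    """Build one interlaced field from a noninterlaced frame.

    field = 0 is centered on frame lines 0, 2, 4, ...
    field = 1 is centered on frame lines 1, 3, 5, ...
    """
    last = len(frame) - 1
    out = []
    for center in range(field, len(frame), 2):
        # Assumption: repeat the edge line when a neighbor is missing.
        rows = [frame[min(max(center + k, 0), last)] for k in (-1, 0, 1)]
        out.append([sum(t * row[x] for t, row in zip(taps, rows))
                    for x in range(len(frame[0]))])
    return out
```

A horizontal line one scan line tall now contributes to both fields at reduced amplitude instead of appearing in only one of them, which is why the flicker is reduced.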

Interlaced-to-Noninterlaced Conversion

In some applications, it is necessary to dis- 
play an interlaced video signal on a noninter- 
laced display. Thus, some form of deinterlacing 
or progressive scan conversion may be required. 

Note that deinterlacing must be performed 
on component video signals (such as R'G'B' or 
YCbCr). Composite color video signals (such 
as NTSC or PAL) cannot be deinterlaced 
directly due to the presence of color subcarrier 
phase information, which would be meaningless 
after processing. These signals must be 
decoded into component color signals, such as 
R'G'B' or YCbCr, prior to deinterlacing. 

There are two fundamental deinterlacing 
algorithms: video mode and film mode. Video 
mode deinterlacing can be further broken 
down into inter-field and intra-field processing. 
The goal of a good deinterlacer is to correctly 
choose the best algorithm needed at a particular 
moment. 

In systems where the vertical resolution of 
the source and display do not match (due to, 
for example, displaying SDTV content on an 
HDTV), the deinterlacing and vertical scaling 
can be merged into a single process. 

Video Mode: Intra-Field Processing 

This is the simplest method for generating 
additional scan lines using only information in 
the original field. The computer industry 
refers to this technique as bob. 

Although there are two common tech- 
niques for implementing intra-field processing, 
scan line duplication and scan line interpola- 
tion, the resulting vertical resolution is always 
limited by the content of the original field. 

Scan Line Duplication 

Scan line duplication (Figure 7.49) simply 
duplicates the previous active scan line. 
Although the number of active scan lines is 
doubled, there is no increase in the vertical 
resolution. 

Scan Line Interpolation 

Scan line interpolation generates interpo- 
lated scan lines between the original active 
scan lines. Although the number of active scan 
lines is doubled, the vertical resolution is not. 

The simplest implementation, shown in 
Figure 7.50, uses linear interpolation to gener- 
ate a new scan line between two input scan 
lines: 

out_n = (in_{n-1} + in_{n+1}) / 2 

Figure 7.49. Deinterlacing Using Scan Line 
Duplication. New scan lines are generated by 
duplicating the active scan line above it. 

Figure 7.50. Deinterlacing Using Scan Line 
Interpolation. New scan lines are generated 
by averaging the previous and next active 
scan lines. 

Figure 7.51. Deinterlacing Using Field 
Merging. Shaded scan lines are generated by 
using the input scan line from the next or 
previous field. 

Figure 7.52. Producing Deinterlaced 
Frames at Field Rates. 

Better results, at additional cost, may be 
achieved by using a FIR filter: 

out_n = (160 * (in_{n-1} + in_{n+1}) 
- 48 * (in_{n-3} + in_{n+3}) 
+ 24 * (in_{n-5} + in_{n+5}) 
- 12 * (in_{n-7} + in_{n+7}) 
+ 6 * (in_{n-9} + in_{n+9}) 
- 2 * (in_{n-11} + in_{n+11})) / 256 
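
The intra-field methods above (duplication, two-line averaging, and the longer FIR filter) can be sketched in one routine. The sketch divides the FIR sum by 256, which is the sum of all tap weights, so that flat areas keep their level. That normalization, the function name `bob`, and the clamping of line indices at the frame edges are assumptions made for illustration.

```python
FIR_TAPS = {1: 160, 3: -48, 5: 24, 7: -12, 9: 6, 11: -2}  # weight per line offset


def bob(field, top_field=True, method="linear"):
    """Deinterlace one field (a list of rows) into a frame with twice as many lines."""
    n_lines = 2 * len(field)
    first = 0 if top_field else 1
    frame = [None] * n_lines
    for i, row in enumerate(field):
        frame[first + 2 * i] = list(row)

    def known(n):
        # Nearest original line, clamped at the frame edges (assumed policy).
        n = min(max(n, first), first + 2 * (len(field) - 1))
        return frame[n]

    width = len(field[0])
    for n in range(n_lines):
        if frame[n] is not None:
            continue
        if method == "duplicate":
            frame[n] = list(known(n - 1))
        elif method == "linear":
            frame[n] = [(a + b) / 2 for a, b in zip(known(n - 1), known(n + 1))]
        else:  # "fir"
            acc = [0.0] * width
            for k, w in FIR_TAPS.items():
                lo, hi = known(n - k), known(n + k)
                for x in range(width):
                    acc[x] += w * (lo[x] + hi[x])
            frame[n] = [v / 256 for v in acc]
    return frame
```

For example, `bob([[10], [20], [30]], method="linear")` fills the gaps with the averages of the lines above and below, while `method="duplicate"` repeats the line above.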

Fractional Ratio Interpolation 

In many cases, there is a periodic, but 
nonintegral, relationship between the number of 
input scan lines and the number of output scan 
lines. In this case, fractional ratio interpolation 
may be necessary, similar to the polyphase 
filtering used for scaling, but performed only in 
the vertical direction. This technique combines 
deinterlacing and vertical scaling into a single 
process. 

Variable Interpolation 

In a few cases, there is no periodicity in the 
relationship between the number of input and 
output scan lines. Therefore, in theory, an 
infinite number of filter phases and coefficients 
are required. Since this is not feasible, the 
solution is to use a large, but finite, number of 
filter phases. The number of filter phases 
determines the interpolation accuracy. This 
technique also combines deinterlacing and 
vertical scaling into a single process. 
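
The role of filter phases can be illustrated with a deliberately simple vertical resampler. It uses only two taps (linear weights) rather than a true polyphase filter bank, and the function name and the optional `n_phases` rounding are assumptions for illustration.

```python
def vertical_resample(lines, n_out, n_phases=None):
    """Resample a list of scan lines to n_out lines using 2-tap linear weights.

    If n_phases is given, each fractional position is rounded to that many
    filter phases, as a finite coefficient table would require.
    Returns the output lines and the phase used for each output line.
    """
    n_in = len(lines)
    out, phases = [], []
    for k in range(n_out):
        pos = k * n_in / n_out        # output line k, measured in input lines
        base = int(pos)
        frac = pos - base
        if n_phases is not None:
            frac = round(frac * n_phases) / n_phases
        nxt = lines[min(base + 1, n_in - 1)]
        out.append([(1 - frac) * a + frac * b for a, b in zip(lines[base], nxt)])
        phases.append(frac)
    return out, phases
```

For a 3-to-4 line ratio the phases are 0, 0.75, 0.5, and 0.25, and the same four phases repeat for every further group of output lines. That repetition is the periodic case; when no such period exists, `n_phases` limits how finely the position is resolved.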

Video Mode: Inter-Field Processing 

In this method, video information from 
more than one field is used to generate a single 
progressive frame. This method can provide 
higher vertical resolution since it uses content 
from more than a single field. 

Field Merging 

This technique merges two consecutive 
fields together to produce a frame of video 
(Figure 7.51). At each field time, the active 
scan lines of that field are merged with the 
active scan lines of the previous field. The 
result is that for each input field time, a pair of 
fields combine to generate a frame (see Figure 
7.52). Although simple to implement, the 
vertical resolution is doubled only in regions of 
no movement. 

Figure 7.53. Movement Artifacts When Field 
Merging Is Used. (Panels: object position in 
field one, object position in field two, object 
positions in merged fields.) 

Moving objects will have artifacts, also 
called combing, due to the time difference 
between two fields: a moving object is located 
in a different position from one field to the 
next. When the two fields are merged, moving 
objects will have a double image (see Figure 
7.53). 

It is common to soften the image slightly in 
the vertical direction to attempt to reduce the 
visibility of combing. When implemented, it 
causes a loss of vertical resolution and jitter on 
movement and pans. 

The computer industry refers to this tech- 
nique as weave, but weave also includes the 
inverse telecine process to remove any 3:2 pull- 
down present in the source. Theoretically, this 
eliminates the double image artifacts since two 
identical fields are now being merged. 
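
Field merging and the combing it produces on motion can be sketched as follows. The helper names and the tiny test pattern (a vertical bar two samples wide) are assumptions made for illustration.

```python
def weave(top_field, bottom_field):
    """Interleave two fields line by line into one frame."""
    frame = []
    for top_row, bottom_row in zip(top_field, bottom_field):
        frame.append(list(top_row))
        frame.append(list(bottom_row))
    return frame


def field_of(frame, parity):
    """Extract the even (parity 0) or odd (parity 1) lines of a frame."""
    return [row[:] for row in frame[parity::2]]


def make_frame(x, width=8, height=4):
    """A frame containing a 2-sample-wide vertical bar starting at column x."""
    return [[1 if x <= col < x + 2 else 0 for col in range(width)]
            for _ in range(height)]


# Both fields sampled from the same picture: merging restores it exactly.
still = weave(field_of(make_frame(2), 0), field_of(make_frame(2), 1))

# The bar moved two samples between the fields: alternate lines disagree.
moving = weave(field_of(make_frame(2), 0), field_of(make_frame(4), 1))
```

In `moving`, even lines show the bar at its old position and odd lines show it at its new position, which is the double image described above.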

Motion Adaptive Deinterlacing 

A good deinterlacing solution is to use field 
merging for still areas of the picture and scan 
line interpolation for areas of movement. To 
accomplish this, motion must be detected on a 
sample-by-sample basis over the entire picture 
in real time, which requires processing several 
fields of video. 

As two fields are combined, full vertical 
resolution is maintained in still areas of the pic- 
ture, where the eye is most sensitive to detail. 
The sample differences may have any value, 
from 0 (no movement and noise-free) to 
maximum (for example, a change from full 
intensity to black). A choice must be made 
whether to use a sample from the previous field 
(which is in the wrong location due to motion) 
or to interpolate a new sample from adjacent 
scan lines in the current field. Sudden 
switching between methods is visible, so 
crossfading (also called soft switching) is used. 
At some magnitude of 
sample difference, the loss of resolution due to 
a double image is equal to the loss of resolu- 
tion due to interpolation. That amount of 
motion should result in the crossfader being at 
the 50% point. Less motion will result in a fade 
towards field merging and more motion in a 
fade towards the interpolated values. 
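
The soft switch described above might be sketched per sample as follows. The motion measure (the difference between the co-sited samples in the previous and next fields), the linear ramp, and the `half_point` and `width` values are all assumptions for illustration; a real design would tune them.

```python
def crossfade_weight(diff, half_point=16.0, width=16.0):
    """Return 0.0 for pure field merging and 1.0 for pure interpolation.

    The weight ramps linearly and equals 0.5 when diff == half_point.
    """
    w = 0.5 + (diff - half_point) / (2.0 * width)
    return min(max(w, 0.0), 1.0)


def motion_adaptive_sample(above, below, prev_sample, next_sample):
    """Estimate one missing sample.

    above, below  -- samples on the adjacent lines of the current field
    prev_sample   -- co-sited sample from the previous field
    next_sample   -- co-sited sample from the next field
    """
    interpolated = (above + below) / 2.0
    diff = abs(next_sample - prev_sample)   # assumed motion measure
    w = crossfade_weight(diff)
    return (1.0 - w) * prev_sample + w * interpolated
```

With no difference between fields the result is the merged sample; with a large difference it is the interpolated one; at the half point the two are mixed equally.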

Rather than “per pixel” motion adaptive 
deinterlacing, which makes decisions for every 
sample, some low-cost solutions use “per field” 
motion adaptive deinterlacing. In this case, the 
algorithm is selected each field, based on the 
amount of motion between the fields. “Per 
pixel” motion adaptive deinterlacing, although 
difficult to implement, looks quite good when 
properly done. “Per field” motion adaptive 
deinterlacing rarely looks much better than 
vertical interpolation. 

Motion-Compensated Deinterlacing 

Motion-compensated (or motion vector 
steered) deinterlacing is several orders of mag- 
nitude more complex than motion adaptive 
deinterlacing, and is commonly found in pro- 
video format converters. 

Motion-compensated processing requires 
calculating motion vectors between fields for 
each sample, and interpolating along each 
sample’s motion trajectory. Motion vectors 
must also be found that pass through each of 
any missing samples. Areas of the picture may 
be covered or uncovered as you move between 
frames. The motion vectors must also have 
sub-pixel accuracy, and be determined in two 
temporal directions between frames. 

Motion vector errors in MPEG are 
self-correcting, since the residual difference 
between the predicted macroblocks is 
encoded. As motion-compensated deinterlacing 
is a single-ended system, motion vector 
errors will produce artifacts, so different 
search and verification algorithms must be 
used. 

Film Mode (Using Inverse Telecine) 

For sources that have 3:2 pulldown (i.e., 60 
fields/second video converted from 24 
frames/second film), higher deinterlacing 
performance may be obtained by removing 
duplicate fields prior to processing. 

The inverse telecine process detects the 
3:2 field sequence and the redundant third 
fields are removed. The remaining field pairs 
are merged (since there is no motion between 
them) to form progressive frames at 24 
frames/second. These are then repeated in a 
3:2 sequence to get to 60 frames/second. 
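
The inverse telecine steps above can be sketched with labels standing in for picture data. The function names are hypothetical, and the exact-equality test for a repeated field is an assumption that only works for labels; real fields need a tolerant comparison and cadence tracking.

```python
def pulldown_32(frames):
    """Turn film frames into a 3:2 sequence of (frame, parity) fields."""
    fields, parity = [], 0
    for i, frame in enumerate(frames):
        for _ in range(3 if i % 2 == 0 else 2):
            fields.append((frame, parity))
            parity ^= 1
    return fields


def inverse_telecine(fields):
    """Drop each field that repeats the one two positions earlier, then pair the rest."""
    kept = []
    for i, f in enumerate(fields):
        if i >= 2 and f == fields[i - 2]:
            continue                      # redundant third field
        kept.append(f)
    # Each remaining pair of fields comes from one film frame.
    return [kept[i][0] for i in range(0, len(kept) - 1, 2)]


def repeat_32(frames):
    """Repeat progressive frames in a 3:2 pattern (24 to 60 frames/second)."""
    out = []
    for i, frame in enumerate(frames):
        out.extend([frame] * (3 if i % 2 == 0 else 2))
    return out
```

Four film frames become ten fields, two of which are repeats; removing them and pairing the rest recovers the four frames.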

Although this may seem to be the ideal 
solution, some content uses both 60 
fields/second video and 24 frames/second video 
(film-based) within a program. In addition, 
some content may occasionally have both video 
types present simultaneously. In other cases, 
the 3:2 pulldown timing (cadence) doesn’t stay 
regular, or the source was never originally 
from film. Thus, the deinterlacer has to detect 
each video type and process it differently 
(video mode vs. film mode). Display artifacts 
are common due to the delay between the 
video type changing and the deinterlacer 
detecting the change. 

Frequency Response Considerations 

Various two-times vertical upsampling 
techniques for deinterlacing may be 
implemented by stuffing zero values between two 
valid lines and filtering, as shown in Figure 
7.54. 

Line A shows the frequency response for 
line duplication, in which the lowpass filter 
coefficients for the filter shown are 1, 1, and 0. 

Line interpolation, using lowpass filter 
coefficients of 0.5, 1.0, and 0.5, results in the 
frequency response curve of Line B. Note that 
line duplication results in a better high-fre- 
quency response. Vertical filters with a better 
frequency response than the one for line dupli- 
cation are possible, at the cost of more line 
stores and processing. 
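
The comparison between the two filters can be checked numerically. This sketch evaluates the magnitude response of each coefficient set; the frequency variable (cycles per output line, so 0.5 is the output Nyquist limit) and the function name are assumptions for illustration.

```python
import cmath
import math


def gain(coeffs, f):
    """Magnitude response of an FIR filter at normalized frequency f."""
    return abs(sum(c * cmath.exp(-2j * math.pi * f * k)
                   for k, c in enumerate(coeffs)))


DUPLICATION = (1.0, 1.0, 0.0)
INTERPOLATION = (0.5, 1.0, 0.5)

for f in (0.0, 0.125, 0.25, 0.375, 0.5):
    print(f"f={f:5.3f}  duplication={gain(DUPLICATION, f):.3f}  "
          f"interpolation={gain(INTERPOLATION, f):.3f}")
```

Both filters have a gain of 2 at zero frequency, which makes up for the inserted zero lines. Between zero and the Nyquist limit the duplication filter stays above the interpolation filter, consistent with the statement that it has the better high-frequency response.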



Figure 7.54. Frequency Response of Various 
Deinterlacing Filters. (A) Line duplication. (B) Line 
interpolation. (C) Field merging. 

The best vertical frequency response is 
obtained when field merging is implemented. 
The spatial position of the lines is already 
correct and no vertical processing is required, 
resulting in a flat curve (Line C). Again, this 
applies only for stationary areas of the image. 



DCT-Based Compression 

The transform process of many video com- 
pression standards is based on the Discrete 
Cosine Transform, or DCT. The easiest way to 
envision it is as a filter bank with all the filters 
computed in parallel. 

During encoding, the DCT is usually fol- 
lowed by several other operations, such as 
quantization, zig-zag scanning, run-length 
encoding, and variable-length encoding. Dur- 
ing decoding, this process flow is reversed. 

Many times, the terms macroblocks and 
blocks are used when discussing video com- 
pression. Figure 7.55 illustrates the relation- 
ship between these two terms, and shows why 
transform processing is usually done on 8 x 8 
samples. 

DCT 

The 8x8 DCT processes an 8 x 8 block of 
samples to generate an 8 x 8 block of DCT 
coefficients, as shown in Figure 7.56. The input 
may be samples from an actual frame of video 
or motion-compensated difference (error) val- 
ues, depending on the encoder mode of opera- 
tion. Each DCT coefficient indicates the 
amount of a particular horizontal or vertical 
frequency within the block. 

DCT coefficient (0,0) is the DC coefficient, 
or average sample value. Since natural images 
tend to vary only slightly from sample to 
sample, low frequency coefficients are typically 
larger values and high frequency coefficients 
are typically smaller values. 

The 8x8 DCT is defined in Figure 7.57. 
f(x, y) denotes sample (x, y) of the 8x8 input 
block and F(u,v) denotes coefficient (u, v) of 
the DCT transformed block. 

A reconstructed 8x8 block of samples is 
generated using an 8 x 8 inverse DCT (IDCT), 
defined in Figure 7.58. Although exact recon- 
struction is theoretically achievable, it is not 
practical due to finite-precision arithmetic, 
quantization and differing IDCT implementa- 
tions. As a result, there are mismatches 
between different IDCT implementations. 
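
A direct, unoptimized evaluation of the 8 x 8 DCT and IDCT definitions can make the round trip concrete. The sketch assumes the usual scale factor C(k) = 1/sqrt(2) for k = 0 and 1 otherwise; the function names are illustrative, and practical codecs use fast factorizations instead.

```python
import math


def c(k):
    """Scale factor: 1/sqrt(2) for the zero-frequency term, else 1."""
    return 1 / math.sqrt(2) if k == 0 else 1.0


def dct_8x8(f):
    """Forward DCT: f[x][y] samples in, F[u][v] coefficients out."""
    return [[0.25 * c(u) * c(v) * sum(
                 f[x][y]
                 * math.cos((2 * x + 1) * u * math.pi / 16)
                 * math.cos((2 * y + 1) * v * math.pi / 16)
                 for x in range(8) for y in range(8))
             for v in range(8)] for u in range(8)]


def idct_8x8(F):
    """Inverse DCT: F[u][v] coefficients in, f[x][y] samples out."""
    return [[0.25 * sum(
                 c(u) * c(v) * F[u][v]
                 * math.cos((2 * x + 1) * u * math.pi / 16)
                 * math.cos((2 * y + 1) * v * math.pi / 16)
                 for u in range(8) for v in range(8))
             for y in range(8)] for x in range(8)]
```

Under this scaling a flat block of value 100 gives F[0][0] = 800 (eight times the average) and all other coefficients essentially zero. In floating point the round trip is accurate to rounding error; the mismatches discussed above arise once coefficients are quantized and integer IDCTs differ.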

Mismatch control attempts to reduce the 
drift between encoder and decoder IDCT 
results by eliminating bit patterns having the 
greatest contribution towards mismatches. 

MPEG-1 mismatch control is known as 
“oddification” since it forces all quantized DCT 
coefficients to odd values. MPEG-2 and 
MPEG-4.2 use an improved method called 
“LSB toggling” which affects only the LSB of 
the 63rd DCT coefficient after inverse 
quantization. 

H.264 (also known as MPEG-4.10) neatly 
sidesteps the issue by using an “exact-match 
inverse transform.” Every decoder will pro- 
duce exactly the same pictures, all else being 
equal. 
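
The two mismatch-control schemes mentioned above can be sketched on integer coefficients. This is a simplified reading of the behavior, not text from the standards: the function names are hypothetical, zero coefficients are assumed to stay zero under oddification, and the coefficient list is assumed to hold all 64 values with index 63 last.

```python
def oddify(coeff):
    """Oddification: move an even, nonzero coefficient one step toward zero."""
    if coeff != 0 and coeff % 2 == 0:
        return coeff - 1 if coeff > 0 else coeff + 1
    return coeff


def lsb_toggle(coeffs):
    """LSB toggling: if the sum of all 64 coefficients is even,
    flip the least significant bit of the last coefficient."""
    out = list(coeffs)
    if sum(out) % 2 == 0:
        out[63] += -1 if out[63] % 2 else 1
    return out
```

After `lsb_toggle` the coefficient sum is always odd, and only one coefficient is ever changed, whereas oddification may alter many coefficients in a block.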

Quantization 

The 8x8 block of DCT coefficients is 
quantized, which reduces the overall precision 
of the integer coefficients and tends to elimi- 
nate high frequency coefficients, while main- 
taining perceptual quality. Higher frequencies 
are usually quantized more coarsely (fewer val- 
ues allowed) than lower frequencies, due to 
visual perception of quantization error. The 
quantizer is also used for constant bit-rate 
applications, where it is varied to control the 
output bit-rate. 

Figure 7.55. The Relationship between Macroblocks and Blocks. 
The picture is divided into 16 x 16 blocks (macroblocks). Each 
macroblock is 16 samples by 16 lines (4 blocks). Each block is 
8 samples by 8 lines. 

Figure 7.56. The DCT Processes the 8 x 8 Block of Samples or Error Terms to Generate 
an 8 x 8 Block of DCT Coefficients. 

F(u, v) = 0.25 C(u) C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} f(x, y) \cos(((2x + 1)u\pi)/16) \cos(((2y + 1)v\pi)/16) 

u, v, x, y = 0, 1, 2, ..., 7 

C(u), C(v) = 1/\sqrt{2} for u, v = 0; 1 otherwise 

(x, y) are spatial coordinates in the sample domain 
(u, v) are coordinates in the transform domain 

Figure 7.57. 8 x 8 Two-Dimensional DCT Definition. 

f(x, y) = 0.25 \sum_{u=0}^{7} \sum_{v=0}^{7} C(u) C(v) F(u, v) \cos(((2x + 1)u\pi)/16) \cos(((2y + 1)v\pi)/16) 

Figure 7.58. 8 x 8 Two-Dimensional Inverse DCT (IDCT) Definition. 

Zig-Zag Scanning 

The quantized DCT coefficients are re- 
arranged into a linear stream by scanning 
them in a zig-zag order. This rearrangement 
places the DC coefficient first, followed by fre- 
quency coefficients arranged in order of 
increasing frequency, as shown in Figures 
7.59, 7.60, and 7.61. This produces long runs of 
zero coefficients. 

Run Length Coding 

The linear stream of quantized frequency 
coefficients is converted into a series of [run, 
amplitude] pairs. [run] indicates the number 
of zero coefficients, and [amplitude] is the 
nonzero coefficient that ended the run. 



Variable-Length Coding 

The [run, amplitude] pairs are coded using 
a variable-length code, resulting in additional 
lossless compression. This produces shorter 
codes for common pairs and longer codes for 
less common pairs. 

This coding method produces a more com- 
pact representation of the DCT coefficients, as 
a large number of DCT coefficients are usually 
quantized to zero and the re-ordering results 
(ideally) in the grouping of long runs of con- 
secutive zero values. 
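
The zig-zag scan and run-length steps can be sketched together. The function names are illustrative, and dropping trailing zeros without emitting an end-of-block marker is a simplifying assumption; the variable-length coding of the pairs is not shown.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag scan order."""
    order = []
    for s in range(2 * n - 1):            # s = row + col selects one anti-diagonal
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even diagonals run from bottom-left to top-right, odd ones the reverse.
        order.extend(reversed(diag) if s % 2 == 0 else diag)
    return order


def run_length(coeffs):
    """Convert a coefficient stream into (run, amplitude) pairs.

    Trailing zeros are simply dropped (a real bitstream signals end of block).
    """
    pairs, run = [], 0
    for value in coeffs:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    return pairs


# A quantized block with a DC term and two nonzero AC terms.
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[2][0] = 12, 5, -3

stream = [block[r][c] for r, c in zigzag_order()]
ac_pairs = run_length(stream[1:])          # the DC coefficient is handled separately
```

Here the 63 AC coefficients collapse to two pairs, which shows why grouping the zeros into long runs pays off.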




 0  1  5  6 14 15 27 28 
 2  4  7 13 16 26 29 42 
 3  8 12 17 25 30 41 43 
 9 11 18 24 31 40 44 53 
10 19 23 32 39 45 52 54 
20 22 33 38 46 51 55 60 
21 34 37 47 50 56 59 61 
35 36 48 49 57 58 62 63 

(Zig-zag scan of the 8 x 8 block of quantized frequency 
coefficients into a linear array of 64 frequency coefficients.) 

Figure 7.59. The 8 x 8 Block of Quantized DCT Coefficients Are Zig-Zag Scanned to 
Arrange in Order of Increasing Frequency. This scanning order is used for H.261, H.263, 
MPEG-1, MPEG-2, MPEG-4.2, ITU-R BT.1618, ITU-R BT.1620, SMPTE 314M, and SMPTE 
370M. 

Figure 7.60. H.263, MPEG-2, and MPEG-4.2 “Alternate-Vertical” Scanning Order. 

Figure 7.61. H.263 and MPEG-4.2 “Alternate-Horizontal” Scanning Order. 



Fixed Pixel Display 
Considerations 

The unique designs and color reproduc- 
tion gamuts of fixed pixel displays have 
resulted in new video processing technologies 
being developed. The result is brighter, 
sharper, more colorful images regardless of 
the video source. 

Expanded Color Reproduction 

Broadcast stations are usually tuned to 
meet the limited color reproduction character- 
istics of CRT-based televisions. To fit the color 
reproduction capabilities of PDP and LCD, 
manufacturers have introduced various color 
expansion technologies. These include using 
independent hue and saturation controls for 
each primary and complementary color, plus 
the flesh color. 



Detail Correction 

In CRT-based televisions, enhancing the 
image is commonly done by altering the elec- 
tron beam diameter. With fixed-pixel displays, 
adding overshoot and undershoot to the video 
signals causes distortion. An acceptable imple- 
mentation is to gradually change the bright- 
ness of the images before and after regions 
needing contour enhancement. 

Non-Uniform Quantization 

Rather than simply increasing the number 
of quantization levels, the quantization steps 
can be changed in accordance with the inten- 
sity of the image. This is possible since people 
better detect small changes in brightness for 
dark images than for bright images. In addi- 
tion, the brighter the image, the less sensitive 
people are to changes in brightness. This 
means that more quantization steps can be 
used for dark images than for bright ones. This 
technique can also be used to increase the 
quantization steps for shades that appear 
frequently. 

Scaling and Deinterlacing 

Fixed-pixel displays, such as LCD and 
plasma, usually upscale then downscale during 
deinterlacing to minimize moire noise due to 
folded distortion. For example, a 1080i source 
is deinterlaced to 2160p, scaled to 1536p, then 
finally scaled to 768p (to drive a 1024 x 768 
display). Alternately, some solutions deinterlace 
and upscale to 1500p, then scale to the 
display's native resolution. 



Application Example 

Figures 7.62 and 7.63 illustrate the typical 
video processing done after video decompres- 
sion and deinterlacing. 

In addition to the primary video source, 
additional video sources typically include an 
on-screen-display (OSD), content navigation 
graphics, closed captioning or subtitles, and a 
second video for picture-in-picture (PIP) . 

The OSD plane displays configuration 
menus for the box, such as video output format 
and resolution, audio output format, etc. OSD 
design is unique to each product, so the OSD 
plane usually supports a wide variety of 
RGB/YCbCr formats and resolutions. Lookup tables 
can gamma-correct linear RGB data, convert 
2-, 4-, or 8-bit indexed color to 32-bit YCbCrA data, 
or translate 0-255 graphics levels to the 16-235 
video levels. 
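
The level translation mentioned above is a natural fit for a 256-entry lookup table. This sketch assumes a linear mapping with round-to-nearest; the function name is illustrative and an actual product may round or clip differently.

```python
def graphics_to_video_lut():
    """256-entry table mapping 0-255 graphics levels onto 16-235 video levels."""
    return [16 + round(level * 219 / 255) for level in range(256)]


lut = graphics_to_video_lut()
video_row = [lut[sample] for sample in (0, 64, 128, 255)]
```

Building the table once and indexing it per sample avoids a multiply and divide for every pixel, which is why lookup tables are the usual choice for this step.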

The content navigation plane displays 
graphics generated by Blu-ray BD-J, HD DVD 
HDi, electronic program guides, etc. It should 
support the same formats and capabilities as 
the OSD plane. 



The subtitle plane is a useful region for 
rendering closed captioning, DVB subtitles, DVD 
subpictures, etc. Lookup tables convert 2-, 4-, 
or 8-bit indexed color to 32-bit YCbCrA data. 

The secondary video plane is usually used 
to support a second video source for 
picture-in-picture (PIP) or graphics (such as JPEG 
images). For graphics data, lookup tables can 
gamma-correct linear RGB data, convert 2-, 4-, 
or 8-bit indexed color to 32-bit YCbCrA data, or 
translate 0-255 graphics levels to the 16-235 
video levels. 

Being able to scale each source indepen- 
dently offers maximum flexibility. In addition 
to being able to output any resolution regard- 
less of the source resolutions, special effects 
can also be accommodated. 

Chromaticity correction ensures colors are 
accurate independent of the sources and 
display (SDTV vs. HDTV). 

Independent brightness, contrast, 
saturation, hue, and sharpness controls for each 
source and video output interface offer the 
most flexibility. For example, PIP can be 
adjusted without affecting the main picture, 
video can be adjusted without affecting still 
picture video quality, etc. 

The optional downscaling and progressive- 
to-interlaced conversion block for the top 
NTSC/PAL encoder in Figure 7.63 enables 
simultaneous HD and SD outputs, or simulta- 
neous progressive and interlaced outputs, 
without affecting the HD or progressive video 
quality. 

The second NTSC/PAL encoder shown at 
the bottom of Figure 7.63 is useful for record- 
ing a program without any OSD or subtitle 
information being accidentally recorded. 

Chapter 8 



NTSC, PAL, and 
SECAM Overview 



To fully understand the NTSC, PAL, and 
SECAM encoding and decoding processes, it 
is helpful to review the background of these 
standards and how they came about. 



NTSC Overview 

The first color television system was devel- 
oped in the United States, and on December 
17, 1953, the Federal Communications Com- 
mission (FCC) approved the transmission 
standard, with broadcasting approved to begin 
January 23, 1954. Most of the work for develop- 
ing a color transmission standard that was 
compatible with the (then current) 525-line, 60- 
field-per-second, 2:1 interlaced monochrome 
standard was done by the National Television 
System Committee (NTSC) . 

Luminance Information 

The monochrome luminance (Y) signal is 
derived from gamma-corrected red, green, and 
blue (R'G'B') signals: 

Y = 0.299R' + 0.587G' + 0.114B' 



Due to the sound subcarrier at 4.5 MHz, a 
requirement was made that the color signal fit 
within the same bandwidth as the mono- 
chrome video signal (0-4.2 MHz) . 

For economic reasons, another require- 
ment was made that monochrome receivers 
must be able to display the black and white 
portion of a color broadcast and that color 
receivers must be able to display a mono- 
chrome broadcast. 

Color Information 

The eye is most sensitive to spatial and 
temporal variations in luminance; therefore, 
luminance information was still allowed the 
entire bandwidth available (0-4.2 MHz) . Color 
information, to which the eye is less sensitive 
and which therefore requires less bandwidth, 
is represented as hue and saturation informa- 
tion. 

The hue and saturation information is 
transmitted using a 3.58 MHz subcarrier, 
encoded so that the receiver can separate the 
hue, saturation, and luminance information 
and convert them back to RGB signals for 
display. Although this allows the transmission of 
color signals within the same bandwidth as 
monochrome signals, the problem still 
remains as to how to separate the color and 
luminance information cost-effectively, since 
they occupy the same portion of the frequency 
spectrum. 

To transmit color information, U and V or I 
and Q “color difference” signals are used: 

R' - Y = 0.701R' - 0.587G' - 0.114B' 

B' - Y = -0.299R' - 0.587G' + 0.886B' 

U = 0.492 (B' - Y) 

V = 0.877 (R' - Y) 

I = 0.596R' - 0.275G' - 0.321B' 

= V cos 33° - U sin 33° 

= 0.736 (R' - Y) - 0.268 (B' - Y) 

Q = 0.212R' - 0.523G' + 0.311B' 

= V sin 33° + U cos 33° 

= 0.478 (R' - Y) + 0.413 (B' - Y) 

The scaling factors to generate U and V 
from (B'-Y) and (R' - Y) were derived due to 
overmodulation considerations during trans- 
mission. If the full range of (B'-Y) and (R' - 
Y) were used, the modulated chrominance lev- 
els would exceed what the monochrome trans- 
mitters were capable of supporting. 
Experimentation determined that modulated 
subcarrier amplitudes of 20% of the Y signal 
amplitude could be permitted above white and 
below black. The scaling factors were then 
selected so that the maximum level of 75% 
color would be at the white level. 
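
The color difference equations and the effect of the U and V scaling factors can be checked numerically. The sketch assumes R'G'B' values normalized to the range 0 to 1 and takes the positive peak of the composite signal as luminance plus chrominance amplitude; the function names are illustrative.

```python
import math


def rgb_to_yuv(r, g, b):
    """Gamma-corrected R'G'B' (0 to 1) to Y, U, V."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return y, 0.492 * (b - y), 0.877 * (r - y)


def uv_to_iq(u, v, angle_deg=33.0):
    """Rotate the U and V axes by 33 degrees to obtain I and Q."""
    a = math.radians(angle_deg)
    i = v * math.cos(a) - u * math.sin(a)
    q = v * math.sin(a) + u * math.cos(a)
    return i, q


def peak_level(r, g, b):
    """Luminance plus chrominance amplitude for a given color."""
    y, u, v = rgb_to_yuv(r, g, b)
    return y + math.hypot(u, v)


yellow_75 = peak_level(0.75, 0.75, 0.0)
cyan_75 = peak_level(0.0, 0.75, 0.75)
```

Both 75% yellow and 75% cyan peak almost exactly at 1.0, the white level, which matches the stated basis for the 0.492 and 0.877 factors.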

I and Q were initially selected since they 
more closely related to the variation of color 
acuity than U and V. The color response of the 
eye decreases as the size of viewed objects 
decreases. Small objects, occupying 
frequencies of 1.3-2.0 MHz, provide little color 
sensation. Medium objects, occupying the 0.6-1.3 
MHz frequency range, are acceptable if 
reproduced along the orange-cyan axis. Larger 
objects, occupying the 0-0.6 MHz frequency 
range, require full three-color reproduction. 

The I and Q bandwidths were chosen 
accordingly, and the preferred color reproduc- 
tion axis was obtained by rotating the U and V 
axes by 33°. The Q component, representing 
the green-purple color axis, was band-limited 
to about 0.6 MHz. The I component, represent- 
ing the orange-cyan color axis, was band-lim- 
ited to about 1.3 MHz. 

Another advantage of limiting the I and Q 
bandwidths to 1.3 MHz and 0.6 MHz, respec- 
tively, is to minimize crosstalk due to asymmet- 
rical sidebands as a result of lowpass filtering 
the composite video signal to about 4.2 MHz. 
Q is a double sideband signal; however, I is 
asymmetrical, bringing up the possibility of 
crosstalk between I and Q. The symmetry of Q 
avoids crosstalk into I; since Q is bandwidth 
limited to 0.6 MHz, I crosstalk falls outside the 
Q bandwidth. 

U and V, both bandwidth-limited to 1.3 
MHz, are now commonly used instead of I and 
Q. When broadcast, UV crosstalk occurs above 
0.6 MHz; however, this is not usually visible 
due to the limited UV bandwidths used by 
NTSC decoders for consumer equipment. 

The UV and IQ vector diagram is shown in 
Figure 8.1. 

Color Modulation 

I and Q (or U and V) are used to modulate 
a 3.58 MHz color subcarrier using two bal- 
anced modulators operating in phase quadra- 
ture: one modulator is driven by the subcarrier 
at sine phase; the other modulator is driven by 
the subcarrier at cosine phase. The outputs of 
the modulators are added together to form the 
modulated chrominance signal: 

C = Q sin (ωt + 33°) + I cos (ωt + 33°) 

ω = 2πF_SC 

F_SC = 3.579545 MHz (± 10 Hz) 

or, if U and V are used instead of I and Q: 

C = U sin ωt + V cos ωt 

Hue information is conveyed by the 
chrominance phase relative to the subcarrier. 
Saturation information is conveyed by 
chrominance amplitude. In addition, if an object has 
no color (such as a white, gray, or black 
object), the subcarrier is suppressed. 
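
The relationship between U, V and the amplitude and phase of the modulated chrominance can be sketched as follows. The function names are illustrative, and phase is measured here from the +U axis, which is an assumed convention.

```python
import math

F_SC = 3.579545e6  # color subcarrier frequency, Hz


def chroma(u, v, t):
    """Modulated chrominance C = U sin(wt) + V cos(wt) at time t in seconds."""
    w = 2 * math.pi * F_SC
    return u * math.sin(w * t) + v * math.cos(w * t)


def saturation_and_hue(u, v):
    """Amplitude and phase (degrees, 0 to 360) of the chrominance vector."""
    return math.hypot(u, v), math.degrees(math.atan2(v, u)) % 360


amplitude, phase = saturation_and_hue(0.3, 0.4)
```

The quadrature sum equals a single sinusoid, amplitude x sin(wt + phase), so amplitude carries saturation and phase carries hue. With U = V = 0 the amplitude is zero, which is the suppressed-subcarrier case for colorless objects.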



Composite Video Generation 

The modulated chrominance is added to 
the luminance information along with appropri- 
ate horizontal and vertical sync signals, blank- 
ing information, and color burst information, to 
generate the composite color video waveform 
shown in Figure 8.2. 

composite NTSC = Y + Q sin (ωt + 33°) 
+ I cos (ωt + 33°) + timing 

or, if U and V are used instead of I and Q: 

composite NTSC = Y + U sin ωt 
+ V cos ωt + timing 



Figure 8.1. UV and IQ Vector Diagram for 75% Color Bars. (Scale in IRE units.) 

Figure 8.2. (M) NTSC Composite Video Signal for 75% Color Bars. 

The bandwidth of the resulting composite 
video signal is shown in Figure 8.3. 

The I and Q (or U and V) information can 
be transmitted without loss of identity as long 
as the proper color subcarrier phase relation- 
ship is maintained at the encoding and decod- 
ing process. A color burst signal, consisting of 
nine cycles of the subcarrier frequency at a 
specific phase, follows most horizontal sync 
pulses, and provides the decoder a reference 
signal so as to be able to recover the I and Q 
(or U and V) signals properly. The color burst 
phase is defined to be along the -U axis as 
shown in Figure 8.1. 

Color Subcarrier Frequency 

The specific choice for the color subcarrier 
frequency was dictated by several factors. The 
first was the need to provide horizontal 
interlace to reduce the visibility of the subcarrier, 
requiring that the subcarrier frequency, F_SC, 
be an odd multiple of one-half the horizontal 
line rate. The second factor was selection of a 
frequency high enough that it generated a fine 
interference pattern having low visibility. 
Third, double sidebands for I and Q (or U and 
V) bandwidths below 0.6 MHz had to be 
allowed. 

The choice of the frequencies is: 

F_H = (4.5 x 10^6 / 286) Hz = 15,734.27 Hz 

F_V = F_H / (525/2) = 59.94 Hz 

F_SC = ((13 x 7 x 5)/2) x F_H = (455/2) x F_H 
= 3.579545 MHz 
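
The three frequencies follow from the 4.5 MHz sound carrier spacing by exact ratios, which a few lines of arithmetic confirm. The variable names are illustrative.

```python
from fractions import Fraction

F_H = Fraction(4_500_000, 286)      # horizontal line rate, Hz
F_V = F_H / Fraction(525, 2)        # vertical field rate, Hz
F_SC = Fraction(455, 2) * F_H       # color subcarrier frequency, Hz

# The subcarrier expressed in units of one-half the line rate.
half_line_multiple = F_SC / (F_H / 2)

print(f"F_H  = {float(F_H):.2f} Hz")
print(f"F_V  = {float(F_V):.2f} Hz")
print(f"F_SC = {float(F_SC) / 1e6:.6f} MHz")
```

The subcarrier is exactly 455 times one-half the line rate, an odd multiple, which is the condition that places chrominance energy midway between the luminance harmonics.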

The resulting F_V (field) and F_H (line) rates 
were slightly different from the monochrome 
standards, but fell well within the tolerance 
ranges and were therefore acceptable. Figure 
8.4 illustrates the resulting spectral 
interleaving. 



The luminance (Y) components are 
modulated due to the horizontal blanking process, 
resulting in bunches of luminance information 
spaced at intervals of F_H. These signals are 
further modulated by the vertical blanking 
process, resulting in luminance frequency 
components occurring at NF_H ± MF_V. N has a 
maximum value of about 277 with a 4.2 MHz 
bandwidth-limited luminance. Thus, luminance 
information is limited to areas about integral 
harmonics of the line frequency (F_H), with 
additional spectral lines offset from NF_H by the 
29.97 Hz vertical frame rate. 

The area in the spectrum between 
luminance groups, occurring at odd multiples of 
one-half the line frequency, contains minimal 
spectral energy and is therefore used for the 
transmission of chrominance information. The 
harmonics of the color subcarrier are 
separated from each other by F_H since they are odd 
multiples of one-half F_H, providing a half-line 
offset and resulting in an interlace pattern that 
moves upward. Four complete fields are 
required to repeat a specific sample position, 
as shown in Figure 8.5. 

NTSC Standards 

Figure 8.6 shows the common designa- 
tions for NTSC systems. The letter M refers to 
the monochrome standard for line and field 
rates (525/59.94), a video bandwidth of 4.2 
MHz, an audio carrier frequency 4.5 MHz 
above the video carrier frequency, and an RF 
channel bandwidth of 6 MHz. NTSC refers to 
the technique to add color information to the 
monochrome signal. Detailed timing parame- 
ters can be found in Table 8.9. 

Figure 8.3. Video Bandwidths of Baseband (M) NTSC Video. (A) 
Using 1.3 MHz I and 0.6 MHz Q signals. (B) Using 1.3 MHz U and 
V signals. 

Figure 8.6. Common NTSC Systems. 

NTSC 4.43 is commonly used for 
multistandard analog VCRs. The horizontal and 
vertical timing is the same as (M) NTSC; color 
encoding uses the PAL modulation format and 
a 4.43361875 MHz color subcarrier frequency. 

NTSC-J, used in Japan, is the same as (M) 
NTSC, except there is no blanking pedestal 
during active video. Thus, active video has a 
nominal amplitude of 714 mV. 

Noninterlaced NTSC is a 262-line, 60 
frames-per-second version of NTSC, as shown 
in Figure 8.7. This format is identical to stan- 
dard (M) NTSC, except that there are 262 lines 
per frame. 

RF Modulation 

Figures 8.8, 8.9, and 8.10 illustrate the 
basic process of converting baseband (M) 
NTSC composite video to an RF (radio fre- 
quency) signal. 

Figure 8.8a shows the frequency spectrum
of a baseband composite video signal. It is
similar to Figure 8.3. However, Figure 8.3 only
shows the upper sideband for simplicity. The
“video carrier” notation at 0 MHz serves only
as a reference point for comparison with
Figure 8.8b.

Figure 8.8b shows the audio/video signal
as it resides within a 6 MHz channel (such as
channel 3). The video signal has been lowpass
filtered, most of the lower sideband has been
removed, and audio information has been
added.

Figure 8.8c details the information present 
on the audio subcarrier for stereo (BTSC) 
operation. 

As shown in Figures 8.9 and 8.10, back
porch clamping (see glossary) of the analog
video signal ensures that the back porch level
is constant, regardless of changes in the
average picture level. White clipping of the video
signal prevents the modulated signal from
going below 10%; a level below 10% may result in
overmodulation and buzzing in television receivers.
The video signal is then lowpass filtered to 4.2
MHz and drives the AM (amplitude modulation)
video modulator. The sync level corresponds
to 100% modulation, the blanking level
corresponds to 75%, and the white level
corresponds to 10%. (M) NTSC systems use an IF
(intermediate frequency) for the video of 45.75
MHz.

Figure 8.7. Noninterlaced NTSC Frame Sequence.

At this point, audio information is added on 
a subcarrier at 41.25 MHz. A monaural audio 
signal is processed as shown in Figure 8.9 and 
drives the FM (frequency modulation) modula- 
tor. The output of the FM modulator is added 
to the IF video signal. 

The SAW filter, used as a vestigial side- 
band filter, provides filtering of the IF signal. 
The mixer, or up converter, mixes the IF signal 
with the desired broadcast frequency. Both 
sum and difference frequencies are generated 
by the mixing process, so the difference signal 
is extracted by using a bandpass filter. 
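
The frequency arithmetic of the up converter can be illustrated with a short sketch. It assumes, as is common practice but not stated in the text, that the local oscillator runs at the desired video carrier plus the 45.75 MHz video IF, so that the difference products land on the channel. The function name `upconvert` and the dictionary keys are hypothetical.

```python
VIDEO_IF_MHZ = 45.75   # video IF from the text
AUDIO_IF_MHZ = 41.25   # audio subcarrier at IF from the text


def upconvert(video_carrier_mhz):
    """Return the mixer products for a desired RF video carrier (MHz).

    Assumption: the local oscillator (LO) sits at carrier + video IF.
    """
    lo = video_carrier_mhz + VIDEO_IF_MHZ
    return {
        "lo": lo,
        # Difference products: kept by the bandpass filter.
        "video": lo - VIDEO_IF_MHZ,
        "audio": lo - AUDIO_IF_MHZ,
        # Sum products: rejected by the bandpass filter.
        "sum_video": lo + VIDEO_IF_MHZ,
        "sum_audio": lo + AUDIO_IF_MHZ,
    }


# Channel 3 has a video carrier of 61.25 MHz.
ch3 = upconvert(61.25)
print(ch3)
```

Under this assumption the audio carrier, which sits 4.5 MHz below the video carrier at IF, ends up 4.5 MHz above it at RF, matching the channel layout described earlier.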

Stereo Audio (Analog) 

BTSC 

This standard, defined by EIA TVSB5 and
known as the BTSC system (Broadcast Television
Systems Committee), is shown in Figure
8.10. Countries that use this system include
the United States, Canada, Mexico, Brazil, and
Taiwan.

To enable stereo, L-R information is transmitted
using a suppressed AM subcarrier. A
SAP (secondary audio program) channel may
also be present, used to transmit a second
language or video description (descriptive audio
for the visually impaired). A professional
channel may also be present, allowing
communication with remote equipment and people.

Zweiton M 

This standard (ITU-R BS.707), also known 
as A2 M, is similar to that used with PAL. The 
L+R information is transmitted on an FM sub- 
carrier at 4.5 MHz. The L-R information, or a 
second L+R audio signal, is transmitted on a 
second FM subcarrier at 4.724212 MHz. 

If stereo or dual mono signals are present, 
the FM subcarrier at 4.724212 MHz is ampli- 
tude-modulated with a 55.0699 kHz subcarrier. 
This 55.0699 kHz subcarrier is 50% amplitude- 
modulated at 149.9 Hz to indicate stereo audio 
or 276.0 Hz to indicate dual mono audio. 

This system is used in South Korea. 
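
The Zweiton M frequencies quoted above line up closely with simple multiples of the line frequency. The sketch below is only a numerical check: the ratios used (14.25, 3.5, 1/105, and 1/57) and the line frequency of 4.5 MHz / 286 are assumptions inferred from the quoted figures, not values given in the text.

```python
# Assumed (M) system line frequency in Hz; not stated in the passage.
F_H = 4_500_000 / 286

# Assumed ratios, chosen because they reproduce the quoted figures.
second_carrier_hz = 4_500_000 + 14.25 * F_H   # quoted: 4.724212 MHz
pilot_hz = 3.5 * F_H                          # quoted: 55.0699 kHz
stereo_id_hz = F_H / 105                      # quoted: 149.9 Hz
dual_mono_id_hz = F_H / 57                    # quoted: 276.0 Hz

print("second FM subcarrier (MHz):", second_carrier_hz / 1e6)
print("pilot subcarrier (kHz):", pilot_hz / 1e3)
print("stereo identification (Hz):", stereo_id_hz)
print("dual mono identification (Hz):", dual_mono_id_hz)
```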
Figure 8.8. Transmission Channel for (M) NTSC. (A) Frequency spectrum of baseband composite
video. (B) Frequency spectrum of typical channel including audio information. (C) Detailed
frequency spectrum of BTSC stereo audio information.

EIA-J

This standard is similar to BTSC, and is
used in Japan. The L+R information is
transmitted on an FM subcarrier at 4.5 MHz. The L-R
signal, or a second L+R signal, is transmitted
on a second FM subcarrier at +2F_H.

If stereo or dual mono signals are present,
a +3.5F_H subcarrier is amplitude-modulated
with either a 982.5 Hz subcarrier (stereo
audio) or a 922.5 Hz subcarrier (dual mono
audio).



Analog Channel Assignments 

Tables 8.1 through 8.4 list the typical chan- 
nel assignments for VHF, UHF, and cable for 
various NTSC systems. 

Note that cable systems routinely reassign
channel numbers to alternate frequencies to
minimize interference and provide multiple
levels of programming (such as regular and
preview premium movie channels).
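
The North American broadcast assignments follow a regular pattern that is easy to compute. The sketch below assumes the conventional band edges (54, 76, 174, and 470 MHz) and the 1.25 MHz and 5.75 MHz carrier offsets within each 6 MHz channel; the function name `us_broadcast_channel` is hypothetical, and cable systems may not follow this plan.

```python
def us_broadcast_channel(ch):
    """Return (video_carrier, audio_carrier, (low_edge, high_edge)) in MHz.

    Assumes the conventional North American VHF/UHF band plan.
    """
    if 2 <= ch <= 4:
        low = 54 + 6 * (ch - 2)
    elif 5 <= ch <= 6:
        low = 76 + 6 * (ch - 5)
    elif 7 <= ch <= 13:
        low = 174 + 6 * (ch - 7)
    elif 14 <= ch <= 69:
        low = 470 + 6 * (ch - 14)
    else:
        raise ValueError("channel must be in the range 2-69")
    # Video carrier is 1.25 MHz above the lower band edge;
    # the audio carrier is 4.5 MHz above the video carrier.
    return low + 1.25, low + 5.75, (low, low + 6)


print(us_broadcast_channel(3))
print(us_broadcast_channel(15))
print(us_broadcast_channel(69))
```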
Figure 8.9. Typical RF Modulation Implementation for (M) NTSC: Mono Audio.

Broadcast  Video      Audio      Channel      Broadcast  Video      Audio      Channel
Channel    Carrier    Carrier    Range        Channel    Carrier    Carrier    Range
           (MHz)      (MHz)      (MHz)                   (MHz)      (MHz)      (MHz)

15         477.25     481.75     476-482      55         717.25     721.75     716-722
16         483.25     487.75     482-488      56         723.25     727.75     722-728
17         489.25     493.75     488-494      57         729.25     733.75     728-734
18         495.25     499.75     494-500      58         735.25     739.75     734-740
19         501.25     505.75     500-506      59         741.25     745.75     740-746
20         507.25     511.75     506-512      60         747.25     751.75     746-752
21         513.25     517.75     512-518      61         753.25     757.75     752-758
22         519.25     523.75     518-524      62         759.25     763.75     758-764
23         525.25     529.75     524-530      63         765.25     769.75     764-770
24         531.25     535.75     530-536      64         771.25     775.75     770-776
25         537.25     541.75     536-542      65         777.25     781.75     776-782
26         543.25     547.75     542-548      66         783.25     787.75     782-788
27         549.25     553.75     548-554      67         789.25     793.75     788-794
28         555.25     559.75     554-560      68         795.25     799.75     794-800
29         561.25     565.75     560-566      69         801.25     805.75     800-806
30         567.25     571.75     566-572
31         573.25     577.75     572-578
32         579.25     583.75     578-584
33         585.25     589.75     584-590
34         591.25     595.75     590-596
35         597.25     601.75     596-602
36         603.25     607.75     602-608
37         609.25     613.75     608-614
38         615.25     619.75     614-620
39         621.25     625.75     620-626

Table 8.1. Analog Broadcast Nominal Frequencies for North America. 
Broadcast  Video      Audio      Channel      Broadcast  Video      Audio      Channel
Channel    Carrier    Carrier    Range        Channel    Carrier    Carrier    Range
           (MHz)      (MHz)      (MHz)                   (MHz)      (MHz)      (MHz)

-          -          -          -            40         633.25     637.75     632-638
1          91.25      95.75      90-96        41         639.25     643.75     638-644
2          97.25      101.75     96-102       42         645.25     649.75     644-650
3          103.25     107.75     102-108      43         651.25     655.75     650-656
4          171.25     175.75     170-176      44         657.25     661.75     656-662
5          177.25     181.75     176-182      45         663.25     667.75     662-668
6          183.25     187.75     182-188      46         669.25     673.75     668-674
7          189.25     193.75     188-194      47         675.25     679.75     674-680
8          193.25     197.75     192-198      48         681.25     685.75     680-686
9          199.25     203.75     198-204      49         687.25     691.75     686-692
10         205.25     209.75     204-210      50         693.25     697.75     692-698
11         211.25     215.75     210-216      51         699.25     703.75     698-704
12         217.25     221.75     216-222      52         705.25     709.75     704-710
13         471.25     475.75     470-476      53         711.25     715.75     710-716
14         477.25     481.75     476-482      54         717.25     721.75     716-722
15         483.25     487.75     482-488      55         723.25     727.75     722-728
16         489.25     493.75     488-494      56         729.25     733.75     728-734
17         495.25     499.75     494-500      57         735.25     739.75     734-740
18         501.25     505.75     500-506      58         741.25     745.75     740-746
19         507.25     511.75     506-512      59         747.25     751.75     746-752
20         513.25     517.75     512-518      60         753.25     757.75     752-758
21         519.25     523.75     518-524      61         759.25     763.75     758-764
22         525.25     529.75     524-530      62         765.25     769.75     764-770
23         531.25     535.75     530-536      -          -          -          -
24         537.25     541.75     536-542      -          -          -          -
25         543.25     547.75     542-548      -          -          -          -
26         549.25     553.75     548-554      -          -          -          -
27         555.25     559.75     554-560      -          -          -          -
28         561.25     565.75     560-566      -          -          -          -
29         567.25     571.75     566-572      -          -          -          -
30         573.25     577.75     572-578
31         579.25     583.75     578-584
32         585.25     589.75     584-590
33         591.25     595.75     590-596
34         597.25     601.75     596-602
35         603.25     607.75     602-608
36         609.25     613.75     608-614
37         615.25     619.75     614-620
38         621.25     625.75     620-626
39         627.25     631.75     626-632

Table 8.2. Analog Broadcast Nominal Frequencies for Japan. 
Cable      Video      Audio      Channel      Cable      Video      Audio      Channel
Channel    Carrier    Carrier    Range        Channel    Carrier    Carrier    Range
           (MHz)      (MHz)      (MHz)                   (MHz)      (MHz)      (MHz)

-          -          -          -            40         319.2625   323.7625   318-324
-          -          -          -            41         325.2625   329.7625   324-330
2          55.25      59.75      54-60        42         331.2750   335.7750   330-336
3          61.25      65.75      60-66        43         337.2625   341.7625   336-342
4          67.25      71.75      66-72        44         343.2625   347.7625   342-348
5          77.25      81.75      76-82        45         349.2625   353.7625   348-354
6          83.25      87.75      82-88        46         355.2625   359.7625   354-360
7          175.25     179.75     174-180      47         361.2625   365.7625   360-366
8          181.25     185.75     180-186      48         367.2625   371.7625   366-372
9          187.25     191.75     186-192      49         373.2625   377.7625   372-378
10         193.25     197.75     192-198      50         379.2625   383.7625   378-384
11         199.25     203.75     198-204      51         385.2625   389.7625   384-390
12         205.25     209.75     204-210      52         391.2625   395.7625   390-396
13         211.25     215.75     210-216      53         397.2625   401.7625   396-402
14         121.2625   125.7625   120-126      54         403.25     407.75     402-408
15         127.2625   131.7625   126-132      55         409.25     413.75     408-414
16         133.2625   137.7625   132-138      56         415.25     419.75     414-420
17         139.25     143.75     138-144      57         421.25     425.75     420-426
18         145.25     149.75     144-150      58         427.25     431.75     426-432
19         151.25     155.75     150-156      59         433.25     437.75     432-438
20         157.25     161.75     156-162      60         439.25     443.75     438-444
21         163.25     167.75     162-168      61         445.25     449.75     444-450
22         169.25     173.75     168-174      62         451.25     455.75     450-456
23         217.25     221.75     216-222      63         457.25     461.75     456-462
24         223.25     227.75     222-228      64         463.25     467.75     462-468
25         229.2625   233.7625   228-234      65         469.25     473.75     468-474
26         235.2625   239.7625   234-240      66         475.25     479.75     474-480
27         241.2625   245.7625   240-246      67         481.25     485.75     480-486
28         247.2625   251.7625   246-252      68         487.25     491.75     486-492
29         253.2625   257.7625   252-258      69         493.25     497.75     492-498
30         259.2625   263.7625   258-264      70         499.25     503.75     498-504
31         265.2625   269.7625   264-270      71         505.25     509.75     504-510
32         271.2625   275.7625   270-276      72         511.25     515.75     510-516
33         277.2625   281.7625   276-282      73         517.25     521.75     516-522
34         283.2625   287.7625   282-288      74         523.25     527.75     522-528
35         289.2625   293.7625   288-294      75         529.25     533.75     528-534
36         295.2625   299.7625   294-300      76         535.25     539.75     534-540
37         301.2625   305.7625   300-306      77         541.25     545.75     540-546
38         307.2625   311.7625   306-312      78         547.25     551.75     546-552
39         313.2625   317.7625   312-318      79         553.25     557.75     552-558

Table 8.3a. Standard Analog Cable TV Nominal Frequencies for USA. 
Cable      Video      Audio      Channel      Cable      Video      Audio      Channel
Channel    Carrier    Carrier    Range        Channel    Carrier    Carrier    Range
           (MHz)      (MHz)      (MHz)                   (MHz)      (MHz)      (MHz)

80         559.25     563.75     558-564      120        769.25     773.75     768-774
81         565.25     569.75     564-570      121        775.25     779.75     774-780
82         571.25     575.75     570-576      122        781.25     785.75     780-786
83         577.25     581.75     576-582      123        787.25     791.75     786-792
84         583.25     587.75     582-588      124        793.25     797.75     792-798
85         589.25     593.75     588-594      125        799.25     803.75     798-804
86         595.25     599.75     594-600      126        805.25     809.75     804-810
87         601.25     605.75     600-606      127        811.25     815.75     810-816
88         607.25     611.75     606-612      128        817.25     821.75     816-822
89         613.25     617.75     612-618      129        823.25     827.75     822-828
90         619.25     623.75     618-624      130        829.25     833.75     828-834
91         625.25     629.75     624-630      131        835.25     839.75     834-840
92         631.25     635.75     630-636      132        841.25     845.75     840-846
93         637.25     641.75     636-642      133        847.25     851.75     846-852
94         643.25     647.75     642-648      134        853.25     857.75     852-858
95         91.25      95.75      90-96        135        859.25     863.75     858-864
96         97.25      101.75     96-102       136        865.25     869.75     864-870
97         103.25     107.75     102-108      137        871.25     875.75     870-876
98         109.2750   113.7750   108-114      138        877.25     881.75     876-882
99         115.2750   119.7750   114-120      139        883.25     887.75     882-888
100        649.25     653.75     648-654      140        889.25     893.75     888-894
101        655.25     659.75     654-660      141        895.25     899.75     894-900
102        661.25     665.75     660-666      142        901.25     905.75     900-906
103        667.25     671.75     666-672      143        907.25     911.75     906-912
104        673.25     677.75     672-678      144        913.25     917.75     912-918
105        679.25     683.75     678-684      145        919.25     923.75     918-924
106        685.25     689.75     684-690      146        925.25     929.75     924-930
107        691.25     695.75     690-696      147        931.25     935.75     930-936
108        697.25     701.75     696-702      148        937.25     941.75     936-942
109        703.25     707.75     702-708      149        943.25     947.75     942-948
110        709.25     713.75     708-714      150        949.25     953.75     948-954
111        715.25     719.75     714-720      151        955.25     959.75     954-960
112        721.25     725.75     720-726      152        961.25     965.75     960-966
113        727.25     731.75     726-732      153        967.25     971.75     966-972
114        733.25     737.75     732-738      154        973.25     977.75     972-978
115        739.25     743.75     738-744      155        979.25     983.75     978-984
116        745.25     749.75     744-750      156        985.25     989.75     984-990
117        751.25     755.75     750-756      157        991.25     995.75     990-996
118        757.25     761.75     756-762      158        997.25     1001.75    996-1002
119        763.25     767.75     762-768      -          -          -          -

Table 8.3b. Standard Analog Cable TV Nominal Frequencies for USA. 
Cable      Video       Audio         Cable      Video       Audio
Channel    Carrier     Carrier       Channel    Carrier     Carrier
           (MHz)       (MHz)                    (MHz)       (MHz)

-          -           -             40         319.2625    323.7625
1          73.2625     77.7625       41         325.2625    329.7625
2          55.2625     59.7625       42         331.2750    335.7750
3          61.2625     65.7625       43         337.2625    341.7625
4          67.2625     71.7625       44         343.2625    347.7625
5          79.2625     83.7625       45         349.2625    353.7625
6          85.2625     89.7625       46         355.2625    359.7625
7          175.2625    179.7625      47         361.2625    365.7625
8          181.2625    185.7625      48         367.2625    371.7625
9          187.2625    191.7625      49         373.2625    377.7625
10         193.2625    197.7625      50         379.2625    383.7625
11         199.2625    203.7625      51         385.2625    389.7625
12         205.2625    209.7625      52         391.2625    395.7625
13         211.2625    215.7625      53         397.2625    401.7625
14         121.2625    125.7625      54         403.2625    407.7625
15         127.2625    131.7625      55         409.2625    413.7625
16         133.2625    137.7625      56         415.2625    419.7625
17         139.2625    143.7625      57         421.2625    425.7625
18         145.2625    149.7625      58         427.2625    431.7625
19         151.2625    155.7625      59         433.2625    437.7625
20         157.2625    161.7625      60         439.2625    443.7625
21         163.2625    167.7625      61         445.2625    449.7625
22         169.2625    173.7625      62         451.2625    455.7625
23         217.2625    221.7625      63         457.2625    461.7625
24         223.2625    227.7625      64         463.2625    467.7625
25         229.2625    233.7625      65         469.2625    473.7625
26         235.2625    239.7625      66         475.2625    479.7625
27         241.2625    245.7625      67         481.2625    485.7625
28         247.2625    251.7625      68         487.2625    491.7625
29         253.2625    257.7625      69         493.2625    497.7625
30         259.2625    263.7625      70         499.2625    503.7625
31         265.2625    269.7625      71         505.2625    509.7625
32         271.2625    275.7625      72         511.2625    515.7625
33         277.2625    281.7625      73         517.2625    521.7625
34         283.2625    287.7625      74         523.2625    527.7625
35         289.2625    293.7625      75         529.2625    533.7625
36         295.2625    299.7625      76         535.2625    539.7625
37         301.2625    305.7625      77         541.2625    545.7625
38         307.2625    311.7625      78         547.2625    551.7625
39         313.2625    317.7625      79         553.2625    557.7625

Table 8.3c. Analog Cable TV Nominal Frequencies for USA: Incrementally 
Related Carrier (IRC) Systems. 








Cable      Video       Audio         Cable      Video       Audio
Channel    Carrier     Carrier       Channel    Carrier     Carrier
           (MHz)       (MHz)                    (MHz)       (MHz)

80         559.2625    563.7625      120        769.2625    773.7625
81         565.2625    569.7625      121        775.2625    779.7625
82         571.2625    575.7625      122        781.2625    785.7625
83         577.2625    581.7625      123        787.2625    791.7625
84         583.2625    587.7625      124        793.2625    797.7625
85         589.2625    593.7625      125        799.2625    803.7625
86         595.2625    599.7625      126        805.2625    809.7625
87         601.2625    605.7625      127        811.2625    815.7625
88         607.2625    611.7625      128        817.2625    821.7625
89         613.2625    617.7625      129        823.2625    827.7625
90         619.2625    623.7625      130        829.2625    833.7625
91         625.2625    629.7625      131        835.2625    839.7625
92         631.2625    635.7625      132        841.2625    845.7625
93         637.2625    641.7625      133        847.2625    851.7625
94         643.2625    647.7625      134        853.2625    857.7625
95         91.2625     95.7625       135        859.2625    863.7625
96         97.2625     101.7625      136        865.2625    869.7625
97         103.2625    107.7625      137        871.2625    875.7625
98         109.2750    113.7750      138        877.2625    881.7625
99         115.2625    119.7625      139        883.2625    887.7625
100        649.2625    653.7625      140        889.2625    893.7625
101        655.2625    659.7625      141        895.2625    899.7625
102        661.2625    665.7625      142        901.2625    905.7625
103        667.2625    671.7625      143        907.2625    911.7625
104        673.2625    677.7625      144        913.2625    917.7625
105        679.2625    683.7625      145        919.2625    923.7625
106        685.2625    689.7625      146        925.2625    929.7625
107        691.2625    695.7625      147        931.2625    935.7625
108        697.2625    701.7625      148        937.2625    941.7625
109        703.2625    707.7625      149        943.2625    947.7625
110        709.2625    713.7625      150        949.2625    953.7625
111        715.2625    719.7625      151        955.2625    959.7625
112        721.2625    725.7625      152        961.2625    965.7625
113        727.2625    731.7625      153        967.2625    971.7625
114        733.2625    737.7625      154        973.2625    977.7625
115        739.2625    743.7625      155        979.2625    983.7625
116        745.2625    749.7625      156        985.2625    989.7625
117        751.2625    755.7625      157        991.2625    995.7625
118        757.2625    761.7625      158        997.2625    1001.7625
119        763.2625    767.7625      -          -           -

Table 8.3d. Analog Cable TV Nominal Frequencies for USA: Incrementally 
Related Carrier (IRC) Systems. 
Cable      Video       Audio         Cable      Video       Audio
Channel    Carrier     Carrier       Channel    Carrier     Carrier
           (MHz)       (MHz)                    (MHz)       (MHz)

-          -           -             40         318.0159    322.5159
1          72.0036     76.5036       41         324.0162    328.5162
2          54.0027     58.5027       42         330.0165    334.5165
3          60.0030     64.5030       43         336.0168    340.5168
4          66.0033     70.5033       44         342.0171    346.5171
5          78.0039     82.5039       45         348.0174    352.5174
6          84.0042     88.5042       46         354.0177    358.5177
7          174.0087    178.5087      47         360.0180    364.5180
8          180.0090    184.5090      48         366.0183    370.5183
9          186.0093    190.5093      49         372.0186    376.5186
10         192.0096    196.5096      50         378.0189    382.5189
11         198.0099    202.5099      51         384.0192    388.5192
12         204.0102    208.5102      52         390.0195    394.5195
13         210.0105    214.5105      53         396.0198    400.5198
14         120.0060    124.5060      54         402.0201    406.5201
15         126.0063    130.5063      55         408.0204    412.5204
16         132.0066    136.5066      56         414.0207    418.5207
17         138.0069    142.5069      57         420.0210    424.5210
18         144.0072    148.5072      58         426.0213    430.5213
19         150.0075    154.5075      59         432.0216    436.5216
20         156.0078    160.5078      60         438.0219    442.5219
21         162.0081    166.5081      61         444.0222    448.5222
22         168.0084    172.5084      62         450.0225    454.5225
23         216.0108    220.5108      63         456.0228    460.5228
24         222.0111    226.5111      64         462.0231    466.5231
25         228.0114    232.5114      65         468.0234    472.5234
26         234.0117    238.5117      66         474.0237    478.5237
27         240.0120    244.5120      67         480.0240    484.5240
28         246.0123    250.5123      68         486.0243    490.5243
29         252.0126    256.5126      69         492.0246    496.5246
30         258.0129    262.5129      70         498.0249    502.5249
31         264.0132    268.5132      71         504.0252    508.5252
32         270.0135    274.5135      72         510.0255    514.5255
33         276.0138    280.5138      73         516.0258    520.5258
34         282.0141    286.5141      74         522.0261    526.5261
35         288.0144    292.5144      75         528.0264    532.5264
36         294.0147    298.5147      76         534.0267    538.5267
37         300.0150    304.5150      77         540.0270    544.5270
38         306.0153    310.5153      78         546.0273    550.5273
39         312.0156    316.5156      79         552.0276    556.5276

Table 8.3e. Analog Cable TV Nominal Frequencies for USA: Harmonically 
Related Carrier (HRC) Systems. 
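
The HRC video carriers in the table are consistent with integer multiples of a single reference frequency. The sketch below assumes that reference is 6.0003 MHz, which is inferred from the tabulated values rather than stated in the text, and works in integer hertz to avoid rounding. The function name `hrc_carriers_hz` is hypothetical; note that HRC channels 1, 5, and 6 use band edges of 72, 78, and 84 MHz rather than the standard cable edges.

```python
HRC_STEP_HZ = 6_000_300        # assumed HRC reference frequency (6.0003 MHz)
AUDIO_OFFSET_HZ = 4_500_000    # audio carrier sits 4.5 MHz above video


def hrc_carriers_hz(low_edge_mhz):
    """Return (video, audio) HRC carriers in Hz for a nominal 6 MHz slot.

    low_edge_mhz is the nominal lower band edge, a multiple of 6 MHz.
    """
    if low_edge_mhz % 6 != 0:
        raise ValueError("nominal band edge must be a multiple of 6 MHz")
    harmonic = low_edge_mhz // 6          # which harmonic of the reference
    video = harmonic * HRC_STEP_HZ
    return video, video + AUDIO_OFFSET_HZ


# Channel 2 (nominal 54-60 MHz) and channel 40 (nominal 318-324 MHz).
print(hrc_carriers_hz(54))
print(hrc_carriers_hz(318))
```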








Cable      Video       Audio         Cable      Video       Audio
Channel    Carrier     Carrier       Channel    Carrier     Carrier
           (MHz)       (MHz)                    (MHz)       (MHz)

80         558.0279    562.5279      120        768.0384    772.5384
81         564.0282    568.5282      121        774.0387    778.5387
82         570.0285    574.5285      122        780.0390    784.5390
83         576.0288    580.5288      123        786.0393    790.5393
84         582.0291    586.5291      124        792.0396    796.5396
85         588.0294    592.5294      125        798.0399    802.5399
86         594.0297    598.5297      126        804.0402    808.5402
87         600.0300    604.5300      127        810.0405    814.5405
88         606.0303    610.5303      128        816.0408    820.5408
89         612.0306    616.5306      129        822.0411    826.5411
90         618.0309    622.5309      130        828.0414    832.5414
91         624.0312    628.5312      131        834.0417    838.5417
92         630.0315    634.5315      132        840.0420    844.5420
93         636.0318    640.5318      133        846.0423    850.5423
94         642.0321    646.5321      134        852.0426    856.5426
95         90.0045     94.5045       135        858.0429    862.5429
96         96.0048     100.5048      136        864.0432    868.5432
97         102.0051    106.5051      137        870.0435    874.5435
98         -           -             138        876.0438    880.5438
99         -           -             139        882.0441    886.5441
100        648.0324    652.5324      140        888.0444    892.5444
101        654.0327    658.5327      141        894.0447    898.5447
102        660.0330    664.5330      142        900.0450    904.5450
103        666.0333    670.5333      143        906.0453    910.5453
104        672.0336    676.5336      144        912.0456    916.5456
105        678.0339    682.5339      145        918.0459    922.5459
106        684.0342    688.5342      146        924.0462    928.5462
107        690.0345    694.5345      147        930.0465    934.5465
108        696.0348    700.5348      148        936.0468    940.5468
109        702.0351    706.5351      149        942.0471    946.5471
110        708.0354    712.5354      150        948.0474    952.5474
111        714.0357    718.5357      151        954.0477    958.5477
112        720.0360    724.5360      152        960.0480    964.5480
113        726.0363    730.5363      153        966.0483    970.5483
114        732.0366    736.5366      154        972.0486    976.5486
115        738.0369    742.5369      155        978.0489    982.5489
116        744.0372    748.5372      156        984.0492    988.5492
117        750.0375    754.5375      157        990.0495    994.5495
118        756.0378    760.5378      158        996.0498    1000.5498
119        762.0381    766.5381      -          -           -

Table 8.3f. Analog Cable TV Nominal Frequencies for USA: Harmonically 
Related Carrier (HRC) Systems. 
Cable      Video      Audio      Channel      Cable      Video      Audio      Channel
Channel    Carrier    Carrier    Range        Channel    Carrier    Carrier    Range
           (MHz)      (MHz)      (MHz)                   (MHz)      (MHz)      (MHz)

-          -          -          -            40         325.25     329.75     324-330
-          -          -          -            41         331.25     335.75     330-336
-          -          -          -            42         337.25     341.75     336-342
13         109.25     113.75     108-114      43         343.25     347.75     342-348
14         115.25     119.75     114-120      44         349.25     353.75     348-354
15         121.25     125.75     120-126      45         355.25     359.75     354-360
16         127.25     131.75     126-132      46         361.25     365.75     360-366
17         133.25     137.75     132-138      47         367.25     371.75     366-372
18         139.25     143.75     138-144      48         373.25     377.75     372-378
19         145.25     149.75     144-150      49         379.25     383.75     378-384
20         151.25     155.75     150-156      50         385.25     389.75     384-390
21         157.25     161.75     156-162      51         391.25     395.75     390-396
22         165.25     169.75     164-170      52         397.25     401.75     396-402
23         223.25     227.75     222-228      53         403.25     407.75     402-408
24         231.25     235.75     230-236      54         409.25     413.75     408-414
25         237.25     241.75     236-242      55         415.25     419.75     414-420
26         243.25     247.75     242-248      56         421.25     425.75     420-426
27         249.25     253.75     248-254      57         427.25     431.75     426-432
28         253.25     257.75     252-258      58         433.25     437.75     432-438
29         259.25     263.75     258-264      59         439.25     443.75     438-444
30         265.25     269.75     264-270      60         445.25     449.75     444-450
31         271.25     275.75     270-276      61         451.25     455.75     450-456
32         277.25     281.75     276-282      62         457.25     461.75     456-462
33         283.25     287.75     282-288      63         463.25     467.75     462-468
34         289.25     293.75     288-294      -          -          -          -
35         295.25     299.75     294-300      -          -          -          -
36         301.25     305.75     300-306      -          -          -          -
37         307.25     311.75     306-312      -          -          -          -
38         313.25     317.75     312-318      -          -          -          -
39         319.25     323.75     318-324      -          -          -          -

Table 8.4. Analog Cable TV Nominal Frequencies for Japan. 
Luminance Equation Derivation

The equation for generating luminance
from RGB is determined by the chromaticities
of the three primary colors used by the
receiver and what color white actually is.

The chromaticities of the RGB primaries
and reference white (CIE illuminant C) were
specified in the 1953 NTSC standard to be:

R:      x_r = 0.67     y_r = 0.33     z_r = 0.00
G:      x_g = 0.21     y_g = 0.71     z_g = 0.08
B:      x_b = 0.14     y_b = 0.08     z_b = 0.78
white:  x_w = 0.3101   y_w = 0.3162   z_w = 0.3737

where x and y are the specified CIE 1931
chromaticity coordinates; z is calculated by
knowing that x + y + z = 1.

Luminance is calculated as a weighted sum
of RGB, with the weights representing the
actual contributions of each of the RGB
primaries in generating the luminance of reference
white. We find the linear combination of RGB
that gives reference white by solving the
equation:

\[
\begin{bmatrix} x_r & x_g & x_b \\ y_r & y_g & y_b \\ z_r & z_g & z_b \end{bmatrix}
\begin{bmatrix} K_r \\ K_g \\ K_b \end{bmatrix}
=
\begin{bmatrix} x_w / y_w \\ 1 \\ z_w / y_w \end{bmatrix}
\]

Rearranging to solve for K_r, K_g, and K_b yields:

\[
\begin{bmatrix} K_r \\ K_g \\ K_b \end{bmatrix}
=
\begin{bmatrix} x_r & x_g & x_b \\ y_r & y_g & y_b \\ z_r & z_g & z_b \end{bmatrix}^{-1}
\begin{bmatrix} x_w / y_w \\ 1 \\ z_w / y_w \end{bmatrix}
\]

Substituting the known values gives us the
solution for K_r, K_g, and K_b:

\[
\begin{bmatrix} K_r \\ K_g \\ K_b \end{bmatrix}
=
\begin{bmatrix} 0.67 & 0.21 & 0.14 \\ 0.33 & 0.71 & 0.08 \\ 0.00 & 0.08 & 0.78 \end{bmatrix}^{-1}
\begin{bmatrix} 0.3101 / 0.3162 \\ 1 \\ 0.3737 / 0.3162 \end{bmatrix}
\]

\[
=
\begin{bmatrix} 1.730 & -0.482 & -0.261 \\ -0.814 & 1.652 & -0.023 \\ 0.083 & -0.169 & 1.284 \end{bmatrix}
\begin{bmatrix} 0.9807 \\ 1 \\ 1.1818 \end{bmatrix}
=
\begin{bmatrix} 0.906 \\ 0.827 \\ 1.430 \end{bmatrix}
\]

Y is defined to be

Y = (K_r y_r)R' + (K_g y_g)G' + (K_b y_b)B'

  = (0.906)(0.33)R' + (0.827)(0.71)G'
    + (1.430)(0.08)B'

or

Y = 0.299R' + 0.587G' + 0.114B'

Modern receivers use a different set of
RGB phosphors, resulting in slightly different
chromaticities of the RGB primaries and
reference white (CIE illuminant D65):

R:      x_r = 0.630    y_r = 0.340    z_r = 0.030
G:      x_g = 0.310    y_g = 0.595    z_g = 0.095
B:      x_b = 0.155    y_b = 0.070    z_b = 0.775
white:  x_w = 0.3127   y_w = 0.3290   z_w = 0.3583

where x and y are the specified CIE 1931
chromaticity coordinates; z is calculated by
knowing that x + y + z = 1. Once again, substituting
the known values gives us the solution for K_r,
K_g, and K_b:

\[
\begin{bmatrix} K_r \\ K_g \\ K_b \end{bmatrix}
=
\begin{bmatrix} 0.630 & 0.310 & 0.155 \\ 0.340 & 0.595 & 0.070 \\ 0.030 & 0.095 & 0.775 \end{bmatrix}^{-1}
\begin{bmatrix} 0.3127 / 0.3290 \\ 1 \\ 0.3583 / 0.3290 \end{bmatrix}
=
\begin{bmatrix} 0.6243 \\ 1.1770 \\ 1.2362 \end{bmatrix}
\]

Since Y is defined to be

Y = (K_r y_r)R' + (K_g y_g)G' + (K_b y_b)B'

  = (0.6243)(0.340)R' + (1.1770)(0.595)G'
    + (1.2362)(0.070)B'

this results in:

Y = 0.212R' + 0.700G' + 0.086B'

However, the standard Y = 0.299R' + 
0.587G' + 0.114B' equation is still used. Adjust- 
ments are made in the receiver to minimize 
color errors. 
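
The derivation can be checked numerically. The following is a minimal sketch, not part of any standard: the function names are hypothetical, the 3x3 system is solved with Cramer's rule to avoid external libraries, and the chromaticity values are the ones quoted in this section.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(m, w):
    """Solve m * k = w for k using Cramer's rule."""
    d = det3(m)
    solution = []
    for col in range(3):
        replaced = [[w[r] if c == col else m[r][c] for c in range(3)]
                    for r in range(3)]
        solution.append(det3(replaced) / d)
    return solution


def luma_coefficients(red, green, blue, white):
    """Each argument is an (x, y) chromaticity pair; z = 1 - x - y.

    Returns (K_r*y_r, K_g*y_g, K_b*y_b), the weights of R', G', B' in Y.
    """
    primaries = [red, green, blue]
    m = [[x for x, _ in primaries],
         [y for _, y in primaries],
         [1 - x - y for x, y in primaries]]
    xw, yw = white
    w = [xw / yw, 1.0, (1 - xw - yw) / yw]
    k = solve3(m, w)
    return tuple(kc * y for kc, (_, y) in zip(k, primaries))


# Original NTSC primaries and white point used in the first derivation.
NTSC_1953 = luma_coefficients((0.67, 0.33), (0.21, 0.71), (0.14, 0.08),
                              (0.3101, 0.3162))
# Modern receiver phosphors with D65 white.
MODERN = luma_coefficients((0.630, 0.340), (0.310, 0.595), (0.155, 0.070),
                           (0.3127, 0.3290))
```

The first tuple should land close to the 0.299, 0.587, 0.114 weights and the second close to the 0.212, 0.700, 0.086 weights derived above; in both cases the three weights sum to 1 because the middle row of the system forces white to have Y = 1.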



PAL Overview 

Europe delayed adopting a color television 
standard, evaluating various systems between 
1953 and 1967 that were compatible with their 
625-line, 50-field-per-second, 2:1 interlaced 
monochrome standard. The NTSC specifica- 
tion was modified to overcome the high order 
of phase and amplitude integrity required dur- 
ing broadcast to avoid color distortion. The 
Phase Alternation Line (PAL) system imple- 
ments a line-by-line reversal of the phase of 
one of the color components, originally relying 
on the eye to average any color distortions to 
the correct color. Broadcasting began in 1967 
in Germany and the United Kingdom, with 
each using a slightly different variant of the 
PAL system. 

Luminance Information 

The monochrome luminance (Y) signal is
derived from R'G'B':

Y = 0.299R' + 0.587G' + 0.114B' 

As with NTSC, the luminance signal occupies the entire video
bandwidth. PAL has several variations, depending on the video
bandwidth and placement of the audio subcarrier. The composite
video signal has a bandwidth of 4.2, 5.0, 5.5, or 6.0 MHz,
depending on the specific PAL standard.

Color Information

To transmit color information, U and V are
used:

U = 0.492 (B' - Y)

V = 0.877 (R' - Y)

U and V have a typical bandwidth of 1.3 MHz.

Color Modulation 

As in the NTSC system, U and V are used 
to modulate the color subcarrier using two bal- 
anced modulators operating in phase quadra- 
ture: one modulator is driven by the subcarrier 
at sine phase; the other modulator is driven by 
the subcarrier at cosine phase. The outputs of 
the modulators are added together to form the 
modulated chrominance signal: 

C = U sin ωt ± V cos ωt

ω = 2πF_SC

F_SC = 4.43361875 MHz (± 5 Hz) for (B, D, G, H, I, N) PAL

F_SC = 3.58205625 MHz (± 5 Hz) for (N_C) PAL

F_SC = 3.57561149 MHz (± 10 Hz) for (M) PAL

In PAL, the phase of V is reversed every other line. V was chosen
for the reversal process since it has a lower gain factor than U
and therefore is less susceptible to a one-half F_H switching rate
imbalance. The result of alternating the V phase at the line rate
is that any color subcarrier phase errors produce complementary
errors, allowing line-to-line averaging at the receiver to cancel
the errors and generate the correct hue with slightly reduced
saturation. This technique requires the PAL receiver to be able to
determine the correct V phase. This is done using a technique
known as AB sync, PAL sync, PAL switch, or swinging burst,
consisting of alternating the phase of the color burst by ±45° at
the line rate. The UV vector diagrams are shown in Figures 8.11
and 8.12.

Simple PAL decoders rely on the eye to 
average the line-by-line hue errors. Standard 
PAL decoders use a 1H delay line to separate U 
from V in an averaging process. Both imple- 
mentations have the problem of Hanover bars, 
in which pairs of adjacent lines have a real and 
complementary hue error. Chrominance verti- 
cal resolution is reduced as a result of the line 
averaging process. 
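
The cancellation of phase errors by line averaging can be illustrated with a few lines of arithmetic. This is a simplified sketch under stated assumptions: chroma is modeled as a complex phasor U + jV, both lines carry the same color and suffer the same phase error, and the function name is hypothetical.

```python
import cmath
import math


def pal_line_average(u, v, phase_error_deg):
    """Average two successive PAL lines that share a subcarrier phase error.

    Returns the recovered (u, v) pair after the decoder undoes the V switch.
    """
    error = cmath.exp(1j * math.radians(phase_error_deg))
    line_n = complex(u, v) * error        # PAL switch = 0: +V transmitted
    line_n1 = complex(u, -v) * error      # PAL switch = 1: -V transmitted
    # Re-inverting V on the switched line is a complex conjugate, which
    # also flips the sign of the phase error on that line.
    averaged = (line_n + line_n1.conjugate()) / 2
    return averaged.real, averaged.imag
```

With this model the averaged phasor equals (U + jV) cos(error): the hue angle is unchanged and the saturation is scaled by the cosine of the phase error, which matches the "correct hue with slightly reduced saturation" behavior described above.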

Composite Video Generation 

The modulated chrominance is added to 
the luminance information along with appropri- 
ate horizontal and vertical sync signals, blank- 
ing signals, and color burst signals, to generate 
the composite color video waveform shown in 
Figure 8.13. 

composite PAL = Y + U sin ωt ± V cos ωt + timing

The bandwidth of the resulting composite 
video signal is shown in Figure 8.14. 

Like NTSC, the luminance components are spaced at F_H intervals
due to horizontal blanking. Since the V component is switched
symmetrically at one-half the line rate, only odd harmonics are
generated, resulting in V components that are spaced at intervals
of F_H. The V components are spaced at half-line intervals from
the U components, which also have F_H spacing. If the subcarrier
had a half-line offset like NTSC uses, the U components would be
perfectly interleaved, but the V components would coincide with
the Y components and thus not be interleaved, creating vertical
stationary dot patterns. For this reason, PAL uses a 1/4 line
offset for the subcarrier frequency:

F_SC = ((1135/4) + (1/625)) F_H for (B, D, G, H, I, N) PAL

F_SC = (909/4) F_H for (M) PAL

F_SC = ((917/4) + (1/625)) F_H for (N_C) PAL

Figure 8.11. UV Vector Diagram for 75% Color Bars. Line [n],
PAL switch = 0.

Figure 8.12. UV Vector Diagram for 75% Color Bars. Line [n + 1],
PAL switch = 1.

Figure 8.13. (B, D, G, H, I, N_C) PAL Composite Video Signal for 75% Color Bars.

Figure 8.14. Video Bandwidths of Some PAL Systems. Panels: (I) PAL
and (B, G, H) PAL; amplitude versus frequency (MHz).

Figure 8.15. Luma and Chroma Frequency Interleave Principle.

The additional (1/625) F_H factor (equal to 25 Hz) provides motion
to the color dot pattern, reducing its visibility. Figure 8.15
illustrates the resulting frequency interleaving. Eight complete
fields are required to repeat a specific sample position, as
shown in Figures 8.16 and 8.17.
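
The subcarrier frequencies quoted earlier follow directly from these relationships. The sketch below assumes a line rate of 15,625 Hz for the 625-line systems and a nominal 15,734.264 Hz for (M) PAL; the function name and the system labels are illustrative only.

```python
from fractions import Fraction

F_H_625 = Fraction(15625)           # line rate of 625-line systems, in Hz
F_H_525 = Fraction("15734.264")     # assumed nominal 525-line rate, in Hz


def pal_subcarrier_hz(system):
    """Return the color subcarrier frequency in Hz as an exact fraction."""
    if system in ("B", "D", "G", "H", "I", "N"):
        return (Fraction(1135, 4) + Fraction(1, 625)) * F_H_625
    if system == "Nc":
        return (Fraction(917, 4) + Fraction(1, 625)) * F_H_625
    if system == "M":
        return Fraction(909, 4) * F_H_525
    raise ValueError(f"unknown PAL system: {system}")


# The (1/625) term contributes exactly 25 Hz on a 625-line system.
OFFSET_HZ = Fraction(1, 625) * F_H_625
```

Exact fractions are used so that the 25 Hz offset is not blurred by floating-point rounding; convert with float() when a decimal value is needed.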

PAL Standards 

Figure 8.19 shows the common designa- 
tions for PAL systems. The letters refer to the 
monochrome standard for line and field rate, 
video bandwidth (4.2, 5.0, 5.5, or 6.0 MHz), 
audio carrier relative frequency, and RF chan- 
nel bandwidth (6.0, 7.0, or 8.0 MHz). PAL 
refers to the technique to add color informa- 
tion to the monochrome signal. Detailed tim- 
ing parameters may be found in Table 8.9. 

Noninterlaced PAL, shown in Figure 8.18, 
is a 312-line, 50-frames-per-second version of 
PAL common among video games and on- 
screen displays. This format is identical to 
standard PAL, except that there are 312 lines 
per frame. 

RF Modulation 

Figures 8.20 and 8.21 illustrate the process 
of converting baseband (G) PAL composite 
video to an RF (radio frequency) signal. The 
process for the other PAL standards is similar, 
except primarily for the different video band- 
widths and subcarrier frequencies. 



Figure 8.20a shows the frequency spec- 
trum of a (G) PAL baseband composite video 
signal. It is similar to Figure 8.14. However, 
Figure 8.14 only shows the upper sideband for 
simplicity. The video carrier notation at 0 MHz 
serves only as a reference point for compari- 
son with Figure 8.20b. 

Figure 8.20b shows the audio/video signal 
as it resides within an 8 MHz channel. The 
video signal has been lowpass filtered, most of 
the lower sideband has been removed, and 
audio information has been added. Note that 
(H) and (I) PAL have a vestigial sideband of 
1.25 MHz, rather than 0.75 MHz. 

Figure 8.20c details the information 
present on the audio subcarrier for analog ste- 
reo operation. 

As shown in Figure 8.21, back porch 
clamping of the analog video signal ensures 
that the back porch level is constant, regard- 
less of changes in the average picture level. 
The video signal is then lowpass filtered to 5.0 
MHz and drives the AM (amplitude modula- 
tion) video modulator. The sync level corre- 
sponds to 100% modulation; the blanking and 
white modulation levels are dependent on the 
specific version of PAL: 

blanking level (% modulation) 

B, G 75% 

D, H, M, N 75% 

I 76% 

white level (% modulation) 

B, G, H, M, N 10% 

D 10% 

I 20% 

Note that PAL systems use a variety of video and audio IF
frequencies (values in MHz):

System    Video     Audio     Notes
B, G      38.900    33.400
B         36.875    31.375    Australia
D         37.000    30.500    China
D         38.900    32.400    OIRT
I         38.900    32.900
I         39.500    33.500    U.K.
M, N      45.750    41.250

Figure 8.16a. Eight-Field (B, D, G, H, I, N_C) PAL Sequence and Burst Blanking (fields one through four).
See Figure 8.5 for equalization and serration pulse details.

Figure 8.16b. Eight-Field (B, D, G, H, I, N_C) PAL Sequence and Burst Blanking (fields five through eight).
See Figure 8.5 for equalization and serration pulse details.

Figure 8.17. Eight-Field (M) PAL Sequence and Burst Blanking.
See Figure 8.5 for equalization and serration pulse details.

Figure 8.18. Noninterlaced PAL Frame Sequence.

In Figures 8.16 through 8.18, the two burst states are:

Burst phase = reference phase = 135° relative to U; PAL switch = 0, +V component.

Burst phase = reference phase + 90° = 225° relative to U; PAL switch = 1, -V component.





At this point, audio information is added on 
the audio subcarrier. A monaural L+R audio 
signal is processed as shown in Figure 8.21 
and drives the FM (frequency modulation) 
modulator. The output of the FM modulator is 
added to the IF video signal. 

The SAW filter, used as a vestigial side- 
band filter, provides filtering of the IF signal. 
The mixer, or up converter, mixes the IF signal 
with the desired broadcast frequency. Both 
sum and difference frequencies are generated 
by the mixing process, so the difference signal 
is extracted by using a bandpass filter. 



Stereo Audio (Analog) 

The standard (ITU-R BS.707), also known as Zweiton or A2, is
shown in Figure 8.21. The L+R information is transmitted on an FM
subcarrier. The R information, or a second L+R audio signal, is
transmitted on a second FM subcarrier at +15.5 F_H.

If stereo or dual mono signals are present, the FM subcarrier at
+15.5 F_H is amplitude-modulated with a 54.6875 kHz (3.5 F_H)
subcarrier. This 54.6875 kHz subcarrier is 50% amplitude-modulated
at 117.5 Hz (F_H/133) to indicate stereo audio or 274.1 Hz
(F_H/57) to indicate dual mono audio.

Countries that use this system include 
Australia, Austria, China, Germany, Italy, 
Malaysia, Netherlands, Slovenia, and Switzer- 
land. 

Stereo Audio (Digital) 

The standard uses NICAM 728 (Near Instantaneous Companded Audio
Multiplex), discussed within BS.707 and ETSI EN 300 163. It was
developed by the BBC and IBA to increase sound quality, provide
multiple channels of digital sound or data, and be more resistant
to transmission interference.

Figure 8.19. Common PAL Systems

Figure 8.20. Transmission Channel for (G) PAL. (A) Frequency spectrum of baseband composite
video. (B) Frequency spectrum of typical channel including audio information. (C) Detailed
frequency spectrum of Zweiton analog stereo audio information (F_H = 15,625 Hz).

The subcarrier resides either 5.85 MHz 
above the video carrier for (B, D, G, H) PAL 
and (L) SECAM systems or 6.552 MHz above 
the video carrier for (I) PAL systems. 

Countries that use NICAM 728 include 
Belgium, China, Denmark, Finland, France, 
Hungary, New Zealand, Norway, Singapore, 
South Africa, Spain, Sweden, and the United 
Kingdom. 

NICAM 728 is a digital system that uses a 32 kHz sampling rate and
14-bit resolution. A bit-rate of 728 kbps is used, giving it the
name NICAM 728. Data is transmitted in frames, with each frame
containing 1 ms of audio. As shown in Figure 8.22, each frame
consists of:

8-bit frame alignment word (01001110)
5 control bits (C0-C4)
11 undefined bits (AD0-AD10)
704 audio data bits (A000-A703)

C0 is a “1” for eight successive frames and a “0” for the next
eight frames, defining a 16-frame sequence. C1-C3 specify the
format transmitted:

“000” = one stereo signal with the left channel being odd-numbered
samples and the right channel being even-numbered samples
“010” = two independent mono channels transmitted in alternate frames
“100” = one mono channel and one 352 kbps data channel transmitted
in alternate frames
“110” = one 704 kbps data channel

C4 is a “1” if the analog sound is the same as the digital sound.

Figure 8.21. Typical RF Modulation Implementation for (G) PAL: Zweiton Stereo Audio.

Stereo Audio Encoding 

The 32 14-bit samples (1 ms of audio, 2’s 
complement format) per channel are pre- 
emphasized to the ITU-T J.17 curve. 

The largest positive or negative sample of the 32 is used to
determine which 10 bits of all 32 samples to transmit. Three range
bits per channel (R0_L, R1_L, R2_L, and R0_R, R1_R, R2_R) are used
to indicate the scaling factor. D13 is the sign bit (“0” =
positive).



D13-D0            R2-R0    Bits Used
01xxxxxxxxxxxx    111      D13, D12-D4
001xxxxxxxxxxx    110      D13, D11-D3
0001xxxxxxxxxx    101      D13, D10-D2
00001xxxxxxxxx    011      D13, D9-D1
000001xxxxxxxx    100      D13, D8-D0
0000001xxxxxxx    010      D13, D8-D0
0000000xxxxxxx    00x      D13, D8-D0
1111111xxxxxxx    00x      D13, D8-D0
1111110xxxxxxx    010      D13, D8-D0
111110xxxxxxxx    100      D13, D8-D0
11110xxxxxxxxx    011      D13, D9-D1
1110xxxxxxxxxx    101      D13, D10-D2
110xxxxxxxxxxx    110      D13, D11-D3
10xxxxxxxxxxxx    111      D13, D12-D4

A parity bit for the six MSBs of each sample is added, resulting
in each sample being 11 bits. The 64 samples are interleaved,
generating L0, R0, L1, R1, L2, R2, ..., L31, R31, and numbered
0-63.

The parity bits are used to convey to the decoder what scaling
factor was used for each channel (“signaling-in-parity”).

If R2_L = “0,” even parity for samples 0, 6, 12, 18, ..., 48 is
used. If R2_L = “1,” odd parity is used.

If R2_R = “0,” even parity for samples 1, 7, 13, 19, ..., 49 is
used. If R2_R = “1,” odd parity is used.

If R1_L = “0,” even parity for samples 2, 8, 14, 20, ..., 50 is
used. If R1_L = “1,” odd parity is used.

If R1_R = “0,” even parity for samples 3, 9, 15, 21, ..., 51 is
used. If R1_R = “1,” odd parity is used.

If R0_L = “0,” even parity for samples 4, 10, 16, 22, ..., 52 is
used. If R0_L = “1,” odd parity is used.

If R0_R = “0,” even parity for samples 5, 11, 17, 23, ..., 53 is
used. If R0_R = “1,” odd parity is used.



Frame alignment word (8 bits): 0, 1, 0, 0, 1, 1, 1, 0
Control bits (5 bits): C0, C1, C2, C3, C4
Additional data bits (11 bits): AD0, AD1, AD2, ..., AD10
Audio data (704 bits): A000, A044, A088, ..., A660,
A001, A045, A089, ..., A661,
A002, A046, A090, ..., A662,
A003, A047, A091, ..., A663,
...
A043, A087, A131, ..., A703

Figure 8.22. NICAM 728 Bitstream for One Frame.

The parity of samples 54-63 is normally even. However, they may be
modified to transmit two additional bits of information:

If CIB0 = “0,” even parity for samples 54, 55, 56, 57, and 58 is
used. If CIB0 = “1,” odd parity is used.

If CIB1 = “0,” even parity for samples 59, 60, 61, 62, and 63 is
used. If CIB1 = “1,” odd parity is used.

The audio data is bit-interleaved as shown 
in Figure 8.22 to reduce the influence of drop- 
outs. If the bits are numbered 0-703, they are 
transmitted in the order 0, 44, 88, ..., 660, 1, 45, 
89, ..., 661, 2, 46, 90, ..., 703. 
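
The transmission order amounts to writing the 704 bits into 16 columns of 44 bits and reading them out row by row. A minimal sketch, with hypothetical function names:

```python
ROWS = 44   # bits per column
COLS = 16   # 44 * 16 = 704 audio data bits per frame


def interleave(bits):
    """Reorder 704 bits into the transmission order 0, 44, 88, ..., 703."""
    assert len(bits) == ROWS * COLS
    return [bits[row + ROWS * col] for row in range(ROWS) for col in range(COLS)]


def deinterleave(received):
    """Undo interleave(), restoring the bits to the order 0, 1, 2, ..., 703."""
    assert len(received) == ROWS * COLS
    bits = [0] * (ROWS * COLS)
    for row in range(ROWS):
        for col in range(COLS):
            bits[row + ROWS * col] = received[row * COLS + col]
    return bits
```

Because adjacent transmitted bits come from positions 44 bits apart, and each sample is 11 bits wide, a short burst of errors touches at most one bit in any given sample.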

The whole frame, except the frame alignment word, is
exclusive-ORed with a 1-bit pseudo-random binary sequence (PRBS).
The PRBS generator is reinitialized after the frame alignment word
of each frame so that the first bit of the sequence processes the
C0 bit. The polynomial of the PRBS is x^9 + x^4 + 1 with an
initialization word of “111111111.”
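
A scrambler of this kind can be sketched with a 9-stage linear feedback shift register. The tap positions and the choice of output stage below are one common realization of x^9 + x^4 + 1 and are an assumption, not taken from the specification; the exact convention should be checked against ETSI EN 300 163 before relying on bit-exact output. The function names are hypothetical.

```python
FRAME_ALIGNMENT_WORD = [0, 1, 0, 0, 1, 1, 1, 0]


def prbs9(n_bits, state=0b111111111):
    """Return n_bits from a 9-stage LFSR with recurrence a[n+9] = a[n+4] ^ a[n]."""
    out = []
    for _ in range(n_bits):
        out.append(state & 1)                    # output the oldest stage
        feedback = (state ^ (state >> 4)) & 1    # assumed tap convention
        state = (state >> 1) | (feedback << 8)
    return out


def scramble(frame_bits):
    """XOR everything after the 8-bit frame alignment word with the PRBS.

    The generator restarts for every frame; applying the function twice
    returns the original frame, so it also serves as the descrambler.
    """
    body = frame_bits[8:]
    return frame_bits[:8] + [b ^ p for b, p in zip(body, prbs9(len(body)))]
```

Since x^9 + x^4 + 1 is a primitive polynomial, the sequence repeats every 511 bits, so the 720 scrambled bits of a frame use a little more than one period.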

Actual transmission consists of taking bits in pairs from the 728
kbps bitstream, then generating 364k symbols per second using
Differential Quadrature Phase-Shift Keying (DQPSK). If the symbol
is “00,” the subcarrier phase is left unchanged. If the symbol is
“01,” the subcarrier phase is delayed 90°. If the symbol is “11,”
the subcarrier phase is inverted. If the symbol is “10,” the
subcarrier phase is advanced 90°.
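
The symbol mapping can be expressed as a table of phase steps. In this sketch a delay is treated as a negative phase step and an advance as a positive one, which is an assumed sign convention; the function names and the starting phase are illustrative.

```python
# Phase change in degrees for each bit pair.
PHASE_STEP_DEG = {(0, 0): 0, (0, 1): -90, (1, 1): 180, (1, 0): 90}


def dqpsk_encode(bits, start_deg=0):
    """Return the absolute subcarrier phase after each two-bit symbol."""
    phase = start_deg
    phases = []
    for i in range(0, len(bits) - 1, 2):
        phase = (phase + PHASE_STEP_DEG[(bits[i], bits[i + 1])]) % 360
        phases.append(phase)
    return phases


def dqpsk_decode(phases_deg, start_deg=0):
    """Recover the bit pairs from successive phase differences."""
    step_to_pair = {step % 360: pair for pair, step in PHASE_STEP_DEG.items()}
    bits = []
    previous = start_deg
    for phase in phases_deg:
        bits.extend(step_to_pair[(phase - previous) % 360])
        previous = phase
    return bits
```

Only the change in phase carries information, so the decoder does not need to know the absolute subcarrier phase, only the phase of the previous symbol.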

Finally, the signal is spectrum-shaped to a -30 dB bandwidth of
approximately 700 kHz for (I) PAL or approximately 500 kHz for
(B, G) PAL.



Stereo Audio Decoding 

A PLL locks to the NICAM subcarrier fre- 
quency and recovers the phase changes that 
represent the encoded symbols. The symbols 
are decoded to generate the 728 kbps bit- 
stream. 

The frame alignment word is found and 
the following bits are exclusive-ORed with a 
locally generated PRBS to recover the packet. 
The C0 bit is tested for 8 frames high, 8 frames
low behavior to verify it is a NICAM 728 bitstream.

The bit-interleaving of the audio data is 
reversed, and the signaling-in-parity decoded: 

A majority vote is taken on the parity of samples 0, 6, 12, ...,
48. If even, R2_L = “0”; if odd, R2_L = “1.”

A majority vote is taken on the parity of samples 1, 7, 13, ...,
49. If even, R2_R = “0”; if odd, R2_R = “1.”

A majority vote is taken on the parity of samples 2, 8, 14, ...,
50. If even, R1_L = “0”; if odd, R1_L = “1.”

A majority vote is taken on the parity of samples 3, 9, 15, ...,
51. If even, R1_R = “0”; if odd, R1_R = “1.”

A majority vote is taken on the parity of samples 4, 10, 16, ...,
52. If even, R0_L = “0”; if odd, R0_L = “1.”

A majority vote is taken on the parity of samples 5, 11, 17, ...,
53. If even, R0_R = “0”; if odd, R0_R = “1.”

A majority vote is taken on the parity of samples 54, 55, 56, 57,
and 58. If even, CIB0 = “0”; if odd, CIB0 = “1.”

A majority vote is taken on the parity of samples 59, 60, 61, 62,
and 63. If even, CIB1 = “0”; if odd, CIB1 = “1.”

Any samples whose parity disagreed with the vote are ignored and
replaced with an interpolated value.

The left channel uses range bits R2_L, R1_L, and R0_L to determine
which bits below the sign bit were discarded during encoding. The
sign bit is duplicated into those positions to generate a 14-bit
sample.

The right channel is similarly processed, using range bits R2_R,
R1_R, and R0_R. Both channels are then de-emphasized using the
J.17 curve.

Dual Mono Audio Encoding 

Two blocks of 32 14-bit samples (2 ms of audio, 2’s complement
format) are pre-emphasized to the ITU-T J.17 specification. As
with the stereo audio, three range bits per block (R0_A, R1_A,
R2_A, and R0_B, R1_B, R2_B) are used to indicate the scaling
factor. Unlike stereo audio, the samples are not interleaved.

If R2_A = “0,” even parity for samples 0, 3, 6, 9, ..., 24 is
used. If R2_A = “1,” odd parity is used.

If R2_B = “0,” even parity for samples 27, 30, 33, ..., 51 is
used. If R2_B = “1,” odd parity is used.

If R1_A = “0,” even parity for samples 1, 4, 7, 10, ..., 25 is
used. If R1_A = “1,” odd parity is used.

If R1_B = “0,” even parity for samples 28, 31, 34, ..., 52 is
used. If R1_B = “1,” odd parity is used.

If R0_A = “0,” even parity for samples 2, 5, 8, 11, ..., 26 is
used. If R0_A = “1,” odd parity is used.

If R0_B = “0,” even parity for samples 29, 32, 35, ..., 53 is
used. If R0_B = “1,” odd parity is used.

The audio data is bit-interleaved; however, 
odd packets contain 64 samples of audio chan- 
nel 1 while even packets contain 64 samples of 
audio channel 2. The rest of the processing is 
the same as for stereo audio. 

Analog Channel Assignments 

Tables 8.5 through 8.7 list the channel assignments for VHF, UHF,
and cable for various PAL systems.

Note that cable systems routinely reassign channel numbers to
alternate frequencies to minimize interference and provide
multiple levels of programming (such as two versions of a premium
movie channel: one for subscribers, and one for nonsubscribers
during preview times).

(B) PAL, Australia, 7 MHz Channel

Channel   Video Carrier   Audio Carrier   Channel Range
          (MHz)           (MHz)           (MHz)
0         46.25           51.75           45-52
1         57.25           62.75           56-63
2         64.25           69.75           63-70
3         86.25           91.75           85-92
4         95.25           100.75          94-101
5         102.25          107.75          101-108
5A        138.25          143.75          137-144
6         175.25          180.75          174-181
7         182.25          187.75          181-188
8         189.25          194.75          188-195
9         196.25          201.75          195-202
10        209.25          214.75          208-215
11        216.25          221.75          215-222
12        223.25

(B) PAL, Italy, 7 MHz Channel

Channel   Video Carrier   Audio Carrier   Channel Range
          (MHz)           (MHz)           (MHz)
A         53.75           59.25           52.5-59.5
B         62.25           67.75           61-68
C         82.25           87.75           81-88
D         175.25          180.75          174-181
E         183.75          189.25          182.5-189.5
F         192.25          197.75          191-198
G         201.25          206.75          200-207
H         210.25          215.75          209-216
H-1       217.25          222.75          216-223
H-2       224.25          229.75          223-230

(I) PAL, Ireland, 8 MHz Channel

Channel   Video Carrier   Audio Carrier   Channel Range
          (MHz)           (MHz)           (MHz)
1         45.75           51.75           44.5-52.5
2         53.75           59.75           52.5-60.5
3         61.75           67.75           60.5-68.5
4         175.25          181.25          174-182
5         183.25          189.25          182-190
6         191.25          197.25          190-198
7         199.25          205.25          198-206
8         207.25          213.25          206-214
9         215.25          221.25          214-222

(B) PAL, New Zealand, 7 MHz Channel

Channel   Video Carrier   Audio Carrier   Channel Range
          (MHz)           (MHz)           (MHz)
1         45.25           50.75           44-51
2         55.25           60.75           54-61
3         62.25           67.75           61-68
4         175.25          180.75          174-181
5         182.25          187.75          181-188
6         189.25          194.75          188-195
7         196.25          201.75          195-202
8         203.25          208.75          202-209
9         210.25          215.75          209-216

Table 8.5. Analog Broadcast and Cable TV Nominal Frequencies for (B, I) PAL in Various Countries

Broadcast   Video Carrier   Audio Carrier (MHz)      Channel Range
Channel     (MHz)           (G, H) PAL    (I) PAL    (MHz)
2¹          45.75           51.25         51.75      44.5-52.5
3¹          53.75           59.25         59.75      52.5-60.5
4¹          61.75           67.25         67.75      60.5-68.5
5¹          175.25          180.75        181.25     174-182
6¹          183.25          188.75        189.25     182-190
7¹          191.25          196.75        197.25     190-198
8¹          199.25          204.75        205.25     198-206
9¹          207.25          212.75        213.25     206-214
10¹         215.25          220.75        221.25     214-222
2²          48.25           53.75         -          47-54
3²          55.25           60.75         -          54-61
4²          62.25           67.75         -          61-68
5²          175.25          180.75        -          174-181
6²          182.25          187.75        -          181-188
7²          189.25          194.75        -          188-195
8²          196.25          201.75        -          195-202
9²          203.25          208.75        -          202-209
10²         210.25          215.75        -          209-216
11²         217.25          222.75        -          216-223
12²         224.25          229.75        -          223-230
21          471.25          476.75        477.25     470-478
22          479.25          484.75        485.25     478-486
23          487.25          492.75        493.25     486-494
24          495.25          500.75        501.25     494-502
25          503.25          508.75        509.25     502-510
26          511.25          516.75        517.25     510-518
27          519.25          524.75        525.25     518-526
28          527.25          532.75        533.25     526-534
29          535.25          540.75        541.25     534-542
30          543.25          548.75        549.25     542-550
31          551.25          556.75        557.25     550-558
32          559.25          564.75        565.25     558-566
33          567.25          572.75        573.25     566-574
34          575.25          580.75        581.25     574-582
35          583.25          588.75        589.25     582-590
36          591.25          596.75        597.25     590-598
37          599.25          604.75        605.25     598-606
38          607.25          612.75        613.25     606-614
39          615.25          620.75        621.25     614-622

Table 8.6a. Analog Broadcast Nominal Frequencies for the
¹United Kingdom, Ireland, ¹South Africa, ¹Hong Kong, and
²Western Europe.

Broadcast   Video Carrier   Audio Carrier (MHz)      Channel Range
Channel     (MHz)           (G, H) PAL    (I) PAL    (MHz)
40          623.25          628.75        629.25     622-630
41          631.25          636.75        637.25     630-638
42          639.25          644.75        645.25     638-646
43          647.25          652.75        653.25     646-654
44          655.25          660.75        661.25     654-662
45          663.25          668.75        669.25     662-670
46          671.25          676.75        677.25     670-678
47          679.25          684.75        685.25     678-686
48          687.25          692.75        693.25     686-694
49          695.25          700.75        701.25     694-702
50          703.25          708.75        709.25     702-710
51          711.25          716.75        717.25     710-718
52          719.25          724.75        725.25     718-726
53          727.25          732.75        733.25     726-734
54          735.25          740.75        741.25     734-742
55          743.25          748.75        749.25     742-750
56          751.25          756.75        757.25     750-758
57          759.25          764.75        765.25     758-766
58          767.25          772.75        773.25     766-774
59          775.25          780.75        781.25     774-782
60          783.25          788.75        789.25     782-790
61          791.25          796.75        797.25     790-798
62          799.25          804.75        805.25     798-806
63          807.25          812.75        813.25     806-814
64          815.25          820.75        821.25     814-822
65          823.25          828.75        829.25     822-830
66          831.25          836.75        837.25     830-838
67          839.25          844.75        845.25     838-846
68          847.25          852.75        853.25     846-854
69          855.25          860.75        861.25     854-862

Table 8.6b. Analog Broadcast Nominal Frequencies for the
United Kingdom, Ireland, South Africa, Hong Kong, and
Western Europe.

Cable     Video Carrier   Audio Carrier   Channel Range
Channel   (MHz)           (MHz)           (MHz)
E 2       48.25           53.75           47-54
E 3       55.25           60.75           54-61
E 4       62.25           67.75           61-68
S 01      69.25           74.75           68-75
S 02      76.25           81.75           75-82
S 03      83.25           88.75           82-89
S 1       105.25          110.75          104-111
S 2       112.25          117.75          111-118
S 3       119.25          124.75          118-125
S 4       126.25          131.75          125-132
S 5       133.25          138.75          132-139
S 6       140.75          145.75          139-146
S 7       147.75          152.75          146-153
S 8       154.75          159.75          153-160
S 9       161.25          166.75          160-167
S 10      168.25          173.75          167-174
E 5       175.25          180.75          174-181
E 6       182.25          187.75          181-188
E 7       189.25          194.75          188-195
E 8       196.25          201.75          195-202
E 9       203.25          208.75          202-209
E 10      210.25          215.75          209-216
E 11      217.25          222.75          216-223
E 12      224.25          229.75          223-230
S 11      231.25          236.75          230-237
S 12      238.25          243.75          237-244
S 13      245.25          250.75          244-251
S 14      252.25          257.75          251-258
S 15      259.25          264.75          258-265
S 16      266.25          271.75          265-272
S 17      273.25          278.75          272-279
S 18      280.25          285.75          279-286
S 19      287.25          292.75          286-293
S 20      294.25          299.75          293-300
S 21      303.25          308.75          302-310
S 22      311.25          316.75          310-318
S 23      319.25          324.75          318-326
S 24      327.25          332.75          326-334
S 25      335.25          340.75          334-342
S 26      343.25          348.75          342-350
S 27      351.25          356.75          350-358
S 28      359.25          364.75          358-366
S 29      367.25          372.75          366-374
S 30      375.25          380.75          374-382
S 31      383.25          388.75          382-390
S 32      391.25          396.75          390-398
S 33      399.25          404.75          398-406
S 34      407.25          412.75          406-414
S 35      415.25          420.75          414-422
S 36      423.25          428.75          422-430
S 37      431.25          436.75          430-438
S 38      439.25          444.75          438-446
S 39      447.25          452.75          446-454
S 40      455.25          460.75          454-462
S 41      463.25          468.75          462-470

Table 8.7. Analog Cable TV Nominal Frequencies for the United Kingdom, Ireland, South Africa,
Hong Kong, and Western Europe.

Luminance Equation Derivation

The equation for generating luminance from RGB information is
determined by the chromaticities of the three primary colors used
by the receiver and what color white actually is.

The chromaticities of the RGB primaries and reference white (CIE
illuminant D65) are:

R:      x_r = 0.64     y_r = 0.33     z_r = 0.03
G:      x_g = 0.29     y_g = 0.60     z_g = 0.11
B:      x_b = 0.15     y_b = 0.06     z_b = 0.79
white:  x_w = 0.3127   y_w = 0.3290   z_w = 0.3583

where x and y are the specified CIE 1931 chromaticity coordinates;
z is calculated by knowing that x + y + z = 1.

As with NTSC, substituting the known values gives us the solution
for K_r, K_g, and K_b:

\[
\begin{bmatrix} K_r \\ K_g \\ K_b \end{bmatrix}
=
\begin{bmatrix} 0.64 & 0.29 & 0.15 \\ 0.33 & 0.60 & 0.06 \\ 0.03 & 0.11 & 0.79 \end{bmatrix}^{-1}
\begin{bmatrix} 0.3127/0.3290 \\ 1 \\ 0.3583/0.3290 \end{bmatrix}
=
\begin{bmatrix} 0.674 \\ 1.177 \\ 1.190 \end{bmatrix}
\]

Y is defined to be

Y = (K_r y_r)R' + (K_g y_g)G' + (K_b y_b)B'

= (0.674)(0.33)R' + (1.177)(0.60)G' + (1.190)(0.06)B'

or

Y = 0.222R' + 0.706G' + 0.071B'

However, the standard Y = 0.299R' + 0.587G' + 0.114B' equation is
still used. Adjustments are made in the receiver to minimize color
errors.

PALplus

PALplus (ITU-R BT.1197 and ETSI ETS 300 731) is the result of a
cooperative project started in 1990, undertaken by several
European broadcasters. By 1995, they wanted to provide an enhanced
definition television system (EDTV), compatible with existing
receivers. PALplus has been transmitted by a few broadcasters
since 1994.

A PALplus picture has a 16:9 aspect ratio. On conventional TVs, it
is displayed as a 16:9 letterboxed image with 430 active lines. On
PALplus TVs, it is displayed as a 16:9 picture with 574 active
lines, with extended vertical resolution. The full video bandwidth
is available for luminance detail. Cross color artifacts are
reduced by clean encoding.

Wide Screen Signaling

Line 23 contains a widescreen signaling (WSS) control signal,
defined by ITU-R BT.1119 and ETSI EN 300 294, used by PALplus TVs.
This signal indicates:

Program Aspect Ratio:

Full Format 4:3
Letterbox 14:9 Center
Letterbox 14:9 Top
Full Format 14:9 Center
Letterbox 16:9 Center
Letterbox 16:9 Top
Full Format 16:9 Anamorphic
Letterbox > 16:9 Center

Enhanced services:

Camera Mode
Film Mode

Subtitles:

Teletext Subtitles Present
Open Subtitles Present

PALplus is defined as being letterbox 16:9 center, camera mode or
film mode, helper signals present using modulation, and clean
encoding used. Teletext subtitles may or may not be present, and
open subtitles may be present only in the active picture area.

During a PALplus transmission, any active video on lines 23 and
623 is blanked prior to encoding. In addition to WSS data, line 23
includes 48 ±1 cycles of a 300 ±9 mV subcarrier with a -U phase,
starting 51 µs ±250 ns after 0_H. Line 623 contains a 10 µs ±250
ns white pulse, starting 20 µs ±250 ns after 0_H.

A PALplus TV has the option of deinterlacing a film mode signal
and displaying it on a 50 Hz progressive-scan display or using
field repeating on a 100 Hz interlaced display.

Ghost Cancellation 

An optional ghost cancellation signal on 
line 318, defined by ITU-R BT.1124 and ETSI 
ETS 300 732, allows a suitably adapted TV to 
measure the ghost signal and cancel any 
ghosting during the active video. A PALplus 
TV may or may not support this feature. 

Vertical Filtering 

All PALplus sources start out as a 16:9 
YCbCr anamorphic image, occupying all 576 
active scan lines. Any active video on lines 23 
and 623 is blanked prior to encoding (since 
these lines are used for WSS and reference 
information), resulting in 574 active lines per 
frame. Lines 24-310 and 336-622 are used for 
active video. 

Before transmission, the 574 active scan 
lines of the 16:9 image are squeezed into 430 
scan lines. To avoid aliasing problems, the ver- 
tical resolution is reduced by lowpass filtering. 

For Y, vertical filtering is done using a 
quadrature mirror filter (QMF) highpass and 
lowpass pair. Using the QMF process allows 
the highpass and lowpass information to be 
resampled, transmitted, and later recombined 
with minimal loss. 

The Y QMF lowpass output is resampled 
into three-quarters of the original height; little 
information is lost to aliasing. After clean 
encoding, it is the letterboxed signal that con- 
ventional 4:3 TVs display. 



The Y QMF highpass output contains the 
rest of the original vertical frequency. It is used 
to generate the helper signals that are trans- 
mitted using the black scan lines not used by 
the letterbox picture. 

Film Mode 

A film mode broadcast has both fields of a 
frame coming from the same image, as is usu- 
ally the case with a movie scanned on a tele- 
cine. 

In film mode, the maximum vertical reso- 
lution per frame is about 287 cycles per active 
picture height (cph), limited by the 574 active 
scan lines per frame. 

The vertical resolution of Y is reduced to 
215 cph so it can be transmitted using only 430 
active lines. The QMF lowpass and highpass 
filters split the Y vertical information into DC- 
215 cph and 216-287 cph. 

The Y lowpass information is re-scanned 
into 430 lines to become the letterbox image. 
Since the vertical frequency is limited to a 
maximum of 215 cph, no information is lost. 

The Y highpass output is decimated so 
only one in four lines is transmitted. These 144 
lines are used to transmit the helper signals. 
Because of the QMF process, no information is 
lost to decimation. 

The 72 lines above and 72 lines below the
central 430 lines of the letterbox image are used
to transmit the 144 lines of the helper signal.
This results in a standard 574 active line picture,
but with the original image in its correct
aspect ratio, centered between the helper signals.
The scan lines containing the 300 mV
helper signals are modulated using the U subcarrier
so they look black and are not visible to
the viewer.

After Fixed ColorPlus processing, the 574 
scan lines are PAL encoded and transmitted as 
a standard interlaced PAL frame. 

Camera Mode 

Camera (or video) mode assumes the 
fields of a frame are independent of each other, 
as would be the case when a camera scans a 
scene in motion. Therefore, the image may 
have changed between fields. Only intra-field 
processing is done. 

In camera mode, the maximum vertical
resolution per field is about 143 cycles per
active picture height (cph), limited by the 287
active scan lines per field.

The vertical resolution of Y is reduced to 
107 cph so it can be transmitted using only 215 
active lines. The QMF lowpass and highpass 
filter pair split the Y vertical information into 
DC-107 cph and 108-143 cph. 

The Y lowpass information is re-scanned 
into 215 lines to become the letterbox image. 
Since the vertical frequency is limited to a 
maximum of 107 cph, no information is lost. 

The Y highpass output is decimated so 
only one in four lines is transmitted. These 72 
lines are used to transmit the helper signals. 
Because of the QMF process, no information is 
lost to decimation. 

The 36 lines above and 36 lines below the
central 215 lines of the letterbox image are used
to transmit the 72 lines of the helper signal.
This results in a 287 active line picture, but
with the original image in its correct aspect
ratio, centered between the helper signals. The
scan lines containing the 300 mV helper signals
are modulated using the U subcarrier so
they look black and are not visible to the
viewer.



After either Fixed or Motion Adaptive Col- 
orPlus processing, the 287 scan lines are PAL 
encoded and transmitted as a PAL field. 
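
The line and resolution figures quoted for film mode (per frame) and camera mode (per field) follow from the same arithmetic. The short Python sketch below reproduces them; the function name and dictionary keys are illustrative only, and the three-quarters split and the "two lines per cycle" rule are taken from the description above rather than from the PALplus specification itself.

```python
def palplus_line_budget(active_lines):
    """Split the active lines of a frame (film mode) or a field (camera mode)
    into letterbox and helper lines, using the figures quoted in the text."""
    letterbox = active_lines * 3 // 4      # Y lowpass output: three-quarters of the height
    helper = active_lines - letterbox      # Y highpass output, carried in the black bars
    return {
        "letterbox_lines": letterbox,
        "helper_lines": helper,
        "helper_above": helper // 2,
        "helper_below": helper - helper // 2,
        "source_cph": active_lines // 2,   # one cycle needs two scan lines
        "letterbox_cph": letterbox // 2,
    }


film = palplus_line_budget(574)     # film mode works on a whole frame
camera = palplus_line_budget(287)   # camera mode works on a single field
print(film)
print(camera)
```

In both modes the helper lines are one quarter of the active lines, which is why decimating the highpass output to one line in four fits exactly into the black bars.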

Clean Encoding 

Only the letterboxed portion of the PAL- 
plus signal is clean encoded. The helper sig- 
nals are not actual PAL video. However, they 
are close enough to video to pass through the 
transmission path and remain fairly invisible 
on standard TVs. 

ColorPlus Processing 

Fixed ColorPlus 

Film mode uses a Fixed ColorPlus tech- 
nique, making use of the lack of motion 
between the two fields of the frame. 

Fixed ColorPlus depends on the subcar- 
rier phase of the composite PAL signal being of 
opposite phase when 312 lines apart. If these 
two lines have the same luminance and 
chrominance information, it can be separated 
by adding and subtracting the composite sig- 
nals from each other. Adding cancels the 
chrominance, leaving luminance. Subtracting 
cancels the luminance, leaving chrominance. 

In practice, Y information above 3 MHz (Y_HF) is intra-frame averaged since it shares the frequency spectrum with the modulated chrominance. For line [n], Y_HF is calculated as follows:

0 ≤ n ≤ 214 for 430-line letterboxed image

$$Y_{HF}(60 + n) = 0.5\,\bigl(Y_{HF}(372 + n) + Y_{HF}(60 + n)\bigr)$$

$$Y_{HF}(372 + n) = Y_{HF}(60 + n)$$

Y_HF is then added to the low-frequency Y (Y_LF) information. The same intra-frame averaging process is also used for Cb and Cr. The 430-line letterbox image is then PAL encoded.

Thus, Y information above 3 MHz, and
CbCr information, is the same on lines [n] and 
[n+312]. Y information below 3 MHz may be 
different on lines [n] and [n+312]. The full ver- 
tical resolution of 287 cph is reconstructed by 
the decoder with the aid of the helper signals. 
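
As a rough illustration of the Fixed ColorPlus idea, the Python sketch below averages two lines that are 312 lines apart and then separates luminance from chrominance by adding and subtracting the composite lines. It is a toy model under stated assumptions: each line is a short list of samples, the chrominance is represented as a signed value whose sign flips between the two lines, and the function names are hypothetical. A factor of 0.5 is applied so the recovered values keep their original scale.

```python
def intra_frame_average(line_a, line_b):
    """Average corresponding samples of lines n and n+312; the result is
    transmitted on both lines (used for Y_HF, Cb, and Cr)."""
    return [0.5 * (a + b) for a, b in zip(line_a, line_b)]


def separate(composite_n, composite_n312):
    """Recover luminance and chrominance from two composite lines that carry
    identical content with the subcarrier in opposite phase."""
    luma = [0.5 * (a + b) for a, b in zip(composite_n, composite_n312)]
    chroma = [0.5 * (a - b) for a, b in zip(composite_n, composite_n312)]
    return luma, chroma


# Toy data: averaged Y_HF shared by both lines, chroma sign flipped on line n+312.
y_hf = intra_frame_average([0.2, 0.4, 0.6], [0.4, 0.4, 0.2])
c = [0.1, -0.2, 0.05]
line_n = [y + ci for y, ci in zip(y_hf, c)]
line_n312 = [y - ci for y, ci in zip(y_hf, c)]

luma, chroma = separate(line_n, line_n312)
print(luma)
print(chroma)
```

The separation is exact only because both lines were forced to carry the same Y_HF and chrominance, which is the purpose of the averaging step.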

Motion Adaptive ColorPlus (MACP) 

Camera mode uses either Motion Adaptive 
ColorPlus or Fixed ColorPlus, depending on 
the amount of motion between fields. This 
requires a motion detector in both the encoder 
and decoder. 

To detect movement, the CbCr data on 
lines [n] and [n+312] are compared. If they 
match, no movement is assumed, and Fixed 
ColorPlus operation is used. If the CbCr data 
doesn’t match, movement is assumed, and 
Motion Adaptive ColorPlus operation is used. 

During Motion Adaptive ColorPlus operation, the amount of Y_HF added to Y_LF is dependent on the difference between CbCr(n) and CbCr(n+312). For the maximum CbCr difference, no Y_HF data for lines [n] and [n+312] is transmitted.

In addition, the amount of intra-frame averaged CbCr data mixed with the direct CbCr data is dependent on the difference between CbCr(n) and CbCr(n+312). For the maximum CbCr difference, only direct CbCr data is transmitted separately for lines [n] and [n+312].



SECAM Overview 

SECAM (Sequentiel Couleur Avec Memoire, or Sequential Color with Memory) was developed in France, with broadcasting starting in 1967. It grew out of the observation that, if color could be bandwidth-limited horizontally, it could also be bandwidth-limited vertically. The two pieces of color information (Db and Dr) added to the monochrome signal could be transmitted on alternate lines, avoiding the possibility of crosstalk.

The receiver requires memory to store 
one line so that it is concurrent with the next 
line, and also requires the addition of a line- 
switching identification technique. 

Like PAL, SECAM is a 625-line, 50-field- 
per-second, 2:1 interlaced system. SECAM was 
adopted by other countries; however, many are 
changing to PAL due to the abundance of pro- 
fessional and consumer PAL equipment. 

Luminance Information 

The monochrome luminance (Y) signal is
derived from R'G'B':

Y = 0.299R' + 0.587G' + 0.114B' 

As with NTSC and PAL, the luminance sig- 
nal occupies the entire video bandwidth. 
SECAM has several variations, depending on 
the video bandwidth and placement of the 
audio subcarrier. The video signal has a band- 
width of 5.0 or 6.0 MHz, depending on the spe- 
cific SECAM standard. 

Color Information 

SECAM transmits Db information during
one line and Dr information during the next
line; luminance information is transmitted
each line. Db and Dr are scaled versions of
B' - Y and R' - Y:

Dr = -1.902 (R'-Y) 

Db = 1.505 (B'-Y) 
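
The two scaling equations can be tried directly. The Python sketch below is a minimal example that assumes gamma-corrected R'G'B' inputs normalized to the range 0 to 1; the function name is illustrative.

```python
def secam_color_difference(r, g, b):
    """Return (Y, Db, Dr) for gamma-corrected R'G'B' values in the range 0..1."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    dr = -1.902 * (r - y)
    db = 1.505 * (b - y)
    return y, db, dr


for name, rgb in {"white": (1, 1, 1), "red": (1, 0, 0), "blue": (0, 0, 1)}.items():
    y, db, dr = secam_color_difference(*rgb)
    print(f"{name:5s}  Y={y:.3f}  Db={db:+.3f}  Dr={dr:+.3f}")
```

With these scale factors, saturated red gives Dr of about -1.333 and saturated blue gives Db of about +1.333, so both signals span the same range; white gives zero for both.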

Since there is an odd number of lines, any given line contains Db information on one field and Dr information on the next field. The decoder requires a 1H delay, switched synchronously with the Db and Dr switching, so that Db and Dr exist simultaneously in order to convert to YCbCr or RGB.

Color Modulation 

SECAM uses FM modulation to transmit 
the Db and Dr color difference information, 
with each component having its own subcar- 
rier. 

Db and Dr are lowpass filtered to 1.3 MHz and pre-emphasis is applied. The curve for the pre-emphasis is expressed by:

$$A = \frac{1 + j\,(f/85)}{1 + j\,(f/255)}$$

where f = signal frequency in kHz.

After pre-emphasis, Db and Dr frequency 
modulate their respective subcarriers. The fre- 
quency of each subcarrier is defined as: 

F_OB = 272 F_H = 4.250000 MHz (± 2 kHz)

F_OR = 282 F_H = 4.406250 MHz (± 2 kHz)

These frequencies represent no color information. Nominal Dr deviation is ±280 kHz and the nominal Db deviation is ±230 kHz. Figure 8.23 illustrates the frequency modulation process of the color difference signals. The choice of frequency shifts reflects the idea of keeping the frequencies representing critical colors away from the upper limit of the spectrum to minimize distortion.
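
The rest frequencies follow from the 15,625 Hz line rate, as the Python sketch below shows. The mapping from signal level to frequency is an assumption made for illustration: it treats the deviation as linear, with a Db or Dr value of ±1.0 producing the nominal deviation, and it ignores pre-emphasis and any limiting. The function and constant names are hypothetical.

```python
F_H = 15625.0        # line frequency in Hz for 625-line, 50-field systems
F_OB = 272 * F_H     # Db rest frequency, 4.250000 MHz
F_OR = 282 * F_H     # Dr rest frequency, 4.406250 MHz
DEV_DB = 230e3       # nominal Db deviation in Hz
DEV_DR = 280e3       # nominal Dr deviation in Hz


def secam_subcarrier_hz(db=None, dr=None):
    """Instantaneous subcarrier frequency for a line carrying Db or Dr.

    Assumes a linear mapping in which a value of +/-1.0 gives the nominal
    deviation; pre-emphasis and limiting are not modeled."""
    if (db is None) == (dr is None):
        raise ValueError("pass exactly one of db or dr")
    if db is not None:
        return F_OB + DEV_DB * db
    return F_OR + DEV_DR * dr


print(F_OB / 1e6, F_OR / 1e6)                 # rest frequencies in MHz
print(secam_subcarrier_hz(dr=-1.0) / 1e6)     # Dr line at full negative deviation
```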

After modulation of Db and Dr, subcarrier pre-emphasis is applied, changing the amplitude of the subcarrier as a function of the frequency deviation. The intention is to reduce the visibility of the subcarriers in areas of low luminance and to improve the signal-to-noise ratio of highly saturated colors. This pre-emphasis is given as:

$$G = M\,\frac{1 + j\,16F}{1 + j\,1.26F}$$

where F = (f/4286) - (4286/f), f = instantaneous subcarrier frequency in kHz, and 2M = 23 ± 2.5% of luminance amplitude.

As shown in Figure 8.24, Db and Dr infor- 
mation is transmitted on alternate scan lines. 
The initial phase of the color subcarrier is also 
modified as shown in Table 8.8 to further 
reduce subcarrier visibility. Note that subcar- 
rier phase information in the SECAM system 
carries no picture information. 
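
To get a feel for the subcarrier pre-emphasis curve, the Python sketch below evaluates it at a few frequencies, taking the expression as G = M(1 + j16F)/(1 + j1.26F) with F = f/4286 - 4286/f and f in kHz. M is normalized to 1 here, and the function name and sample frequencies are illustrative choices.

```python
import math

F0_KHZ = 4286.0  # center frequency of the pre-emphasis curve, in kHz


def subcarrier_preemphasis(f_khz, m=1.0):
    """Complex gain G for an instantaneous subcarrier frequency given in kHz."""
    big_f = f_khz / F0_KHZ - F0_KHZ / f_khz
    return m * complex(1.0, 16.0 * big_f) / complex(1.0, 1.26 * big_f)


for f in (4286.0, 4250.0, 4406.25, 3900.0, 4756.0):
    gain = abs(subcarrier_preemphasis(f))
    print(f"{f:8.2f} kHz   |G|/M = {gain:.3f}   ({20 * math.log10(gain):+.2f} dB)")
```

The gain is smallest at 4286 kHz and rises as the subcarrier deviates in either direction, so weakly colored areas get a low-amplitude subcarrier while saturated colors get a stronger one.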

Composite Video Generation 

The subcarrier data is added to the lumi- 
nance along with appropriate horizontal and 
vertical sync signals, blanking signals, and 
burst signals to generate composite video. 

As with PAL, SECAM requires some
means of identifying the line-switching
sequence. Modern practice has been to use an
F_OR/F_OB burst after most horizontal syncs to
derive the switching synchronization information,
as shown in Figure 8.25.

SECAM Standards 

Figure 8.26 shows the common designations for SECAM systems. The letters refer to the monochrome standard for line and field rates, video bandwidth (5.0 or 6.0 MHz), audio carrier relative frequency, and RF channel bandwidth. "SECAM" refers to the technique used to add color information to the monochrome signal. Detailed timing parameters may be found in Table 8.9.
Figure 8.23. SECAM FM Color Modulation.

The initial phase of the subcarrier undergoes in each line a variation defined by:

Frame to frame: 0°, 180°, 0°, 180° ...
Line to line: 0°, 0°, 180°, 0°, 0°, 180° ... or 0°, 0°, 0°, 180°, 180°, 180° ...

Table 8.8. SECAM Subcarrier Timing.
Figure 8.24. Four-Field SECAM Sequence. See Figure 8.5 for equalization and serration pulse
details.

Figure 8.25. SECAM Chroma Synchronization Signals.
Figure 8.26. Common SECAM Systems.

Luminance Equation Derivation

The equation for generating luminance from RGB information is determined by the chromaticities of the three primary colors used by the receiver and what color white actually is.

The chromaticities of the RGB primaries and reference white (CIE illuminant D65) are:

R:       x_r = 0.64      y_r = 0.33      z_r = 0.03
G:       x_g = 0.29      y_g = 0.60      z_g = 0.11
B:       x_b = 0.15      y_b = 0.06      z_b = 0.79
white:   x_w = 0.3127    y_w = 0.3290    z_w = 0.3583

where x and y are the specified CIE 1931 chromaticity coordinates; z is calculated by knowing that x + y + z = 1. Once again, substituting the known values gives us the solution for K_r, K_g, and K_b:

$$
\begin{bmatrix} K_r \\ K_g \\ K_b \end{bmatrix}
=
\begin{bmatrix} 0.64 & 0.29 & 0.15 \\ 0.33 & 0.60 & 0.06 \\ 0.03 & 0.11 & 0.79 \end{bmatrix}^{-1}
\begin{bmatrix} 0.3127/0.3290 \\ 1 \\ 0.3583/0.3290 \end{bmatrix}
=
\begin{bmatrix} 0.674 \\ 1.177 \\ 1.190 \end{bmatrix}
$$

Y is defined to be

$$
\begin{aligned}
Y &= (K_r y_r)R' + (K_g y_g)G' + (K_b y_b)B' \\
  &= (0.674)(0.33)R' + (1.177)(0.60)G' + (1.190)(0.06)B'
\end{aligned}
$$

or

Y = 0.222R' + 0.706G' + 0.071B'

However, the standard Y = 0.299R' + 
0.587G' + 0.114B' equation is still used. Adjust- 
ments are made in the receiver to minimize 
color errors. 
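
The derivation amounts to solving a 3 x 3 linear system: the matrix of primary chromaticities times (K_r, K_g, K_b) must equal the white point scaled so that its y component is 1. The Python sketch below solves it with Cramer's rule using only the values listed above; the helper names are illustrative. Because the intermediate K values are not rounded here, the results can differ from the quoted figures by about one unit in the third decimal place.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(m, w):
    """Solve m @ k = w for k using Cramer's rule."""
    d = det3(m)
    result = []
    for col in range(3):
        replaced = [[w[r] if c == col else m[r][c] for c in range(3)] for r in range(3)]
        result.append(det3(replaced) / d)
    return result


primaries = [
    [0.64, 0.29, 0.15],   # x_r, x_g, x_b
    [0.33, 0.60, 0.06],   # y_r, y_g, y_b
    [0.03, 0.11, 0.79],   # z_r, z_g, z_b
]
x_w, y_w, z_w = 0.3127, 0.3290, 0.3583
white = [x_w / y_w, 1.0, z_w / y_w]

k_r, k_g, k_b = solve3(primaries, white)
coeffs = [k * y for k, y in zip((k_r, k_g, k_b), primaries[1])]
print([round(k, 3) for k in (k_r, k_g, k_b)])   # K_r, K_g, K_b
print([round(c, 3) for c in coeffs])            # luminance weights for R', G', B'
```

The three luminance weights sum to 1 by construction, since the middle row of the system is normalized to the white point's y value.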
                                   M           N               B, G     H       I       D, K    K1      L
SCAN LINES PER FRAME               525         625             625
FIELD FREQUENCY (FIELDS/SECOND)    59.94       50              50
LINE FREQUENCY (HZ)                15,734      15,625          15,625
PEAK WHITE LEVEL (IRE)             100         100             100
SYNC TIP LEVEL (IRE)               -40         -40 (-43)       -43
SETUP (IRE)                        7.5 ± 2.5   7.5 ± 2.5 (0)   0
PEAK VIDEO LEVEL (IRE)             120                         133              133     115     115     125
GAMMA OF RECEIVER                  2.2         2.8             2.8      2.8     2.8     2.8     2.8     2.8
VIDEO BANDWIDTH (MHZ)              4.2         5.0 (4.2)       5.0      5.0     5.5     6.0     6.0     6.0
LUMINANCE SIGNAL                   Y = 0.299R' + 0.587G' + 0.114B' (RGB ARE GAMMA-CORRECTED)

Rows with three entries give the values for M, for N, and the value shared by B, G, H, I, D, K, K1, and L.

1 Values in parentheses apply to (N_C) PAL used in Argentina.

Table 8.9a. Basic Characteristics of Color Video Signals.
Characteristics                                     M             N              B, D, G, H, I, K, K1, L, N_C
Nominal line period (µs)                            63.5555       64             64
Line blanking interval (µs)                         10.7 ± 0.1    10.88 ± 0.64   11.85 ± 0.15
0H to start of active video (µs)                    9.2 ± 0.1     9.6 ± 0.64     10.5
Front porch (µs)                                    1.5 ± 0.1     1.92 ± 0.64    1.65 ± 0.15
Line synchronizing pulse (µs)                       4.7 ± 0.1     4.99 ± 0.77    4.7 ± 0.2
Rise and fall time of line blanking
  (10%, 90%) (ns)                                   140 ± 20      300 ± 100      300 ± 100
Rise and fall time of line synchronizing
  pulses (10%, 90%) (ns)                            140 ± 20      <250           250 ± 50

Notes:

1. 0H is at 50% point of falling edge of horizontal sync.

2. In case of different standards having different specifications and tolerances, the tightest specification and tolerance are listed.

3. Timing is measured between half-amplitude points on appropriate signal edges.

Table 8.9b. Details of Line Synchronization Signals.
Characteristics                                     M             N              B, D, G, H, I, K, K1, L, N_C
Field period (ms)                                   16.6833       20             20
Field blanking interval                             20 lines      19-25 lines    25 lines
Rise and fall time of field blanking
  (10%, 90%) (ns)                                   140 ± 20      <250           300 ± 100
Duration of equalizing and
  synchronizing sequences                           3 H           3 H            2.5 H
Equalizing pulse width (µs)                         2.3 ± 0.1     2.43 ± 0.13    2.35 ± 0.1
Serration pulse width (µs)                          4.7 ± 0.1     4.7 ± 0.8      4.7 ± 0.1
Rise and fall time of synchronizing and
  equalizing pulses (10%, 90%) (ns)                 140 ± 20      <250           250 ± 50

Notes:

1. In case of different standards having different specifications and tolerances, the tightest specification and tolerance are listed.

2. Timing is measured between half-amplitude points on appropriate signal edges.

Table 8.9c. Details of Field Synchronization Signals.
                                M / NTSC               M / PAL               B, D, G, H, I, N / PAL    B, D, G, K, K1, L / SECAM
ATTENUATION OF COLOR            U, V, I, Q:            < 2 dB at 1.3 MHz     < 3 dB at 1.3 MHz         < 3 dB at 1.3 MHz
DIFFERENCE SIGNALS              < 2 dB at 1.3 MHz      > 20 dB at 3.6 MHz    > 20 dB at 4 MHz          > 30 dB at 3.5 MHz
                                > 20 dB at 3.6 MHz                           (> 20 dB at 3.6 MHz)      (before low-frequency
                                or Q:                                                                  pre-correction)
                                < 2 dB at 0.4 MHz
                                < 6 dB at 0.5 MHz
                                > 6 dB at 0.6 MHz
START OF BURST AFTER 0H (µs)    5.3 ± 0.07             5.8 ± 0.1             5.6 ± 0.1
BURST DURATION (CYCLES)         9 ± 1                  9 ± 1                 10 ± 1 (9 ± 1)
BURST PEAK AMPLITUDE            40 ± 1 IRE             42.86 ± 4 IRE         42.86 ± 4 IRE

Note: Values in parentheses apply to (N_C) PAL used in Argentina.

Table 8.9d. Basic Characteristics of Color Video Signals.



Video Test Signals 

Many industry-standard video test signals 
have been defined to help test the relative qual- 
ity of encoding, decoding, and the transmis- 
sion path, and to perform calibration. Note that 
some video test signals cannot properly be 
generated by providing RGB data to an 
encoder; in this case, YCbCr data may be used. 

If the video standard uses a 7.5-IRE setup,
typically only test signals used for visual examination
use the 7.5-IRE setup. Test signals
designed for measurement purposes typically
use a 0-IRE setup, providing the advantage of
defining a known blanking level.

Color Bars Overview

Color bars are one of the standard video test signals, and there are several variations, depending on the video standard and application. For this reason, this section reviews the most common color bar formats. Color bars have two major characteristics: amplitude and saturation.

The amplitude of a color bar signal is determined by:

$$\text{amplitude (\%)} = \frac{\max(R, G, B)_a}{\max(R, G, B)_b} \times 100$$

where max(R,G,B)_a is the maximum value of R'G'B' during colored bars and max(R,G,B)_b is the maximum value of R'G'B' during reference white.

The saturation of a color bar signal is less than 100% if the minimum value of any one of the R'G'B' components is not zero. The saturation is determined by:

$$\text{saturation (\%)} = \left[\,1 - \left(\frac{\min(R, G, B)}{\max(R, G, B)}\right)^{\gamma}\,\right] \times 100$$

where min(R,G,B) and max(R,G,B) are the minimum and maximum values, respectively, of R'G'B' during color bars, and γ is the gamma exponent, typically [1/0.45].
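
The two definitions are simple enough to check numerically. The Python sketch below is a minimal example with normalized R'G'B' levels (0 to 1); the function names are illustrative. The last line uses a gamma of 2.8, the receiver gamma listed for the PAL systems in Table 8.9a, which is an assumption about how the "98%" label for 100/0/100/25 bars arises.

```python
def amplitude_percent(max_bar, max_white):
    """Amplitude of the colored bars relative to reference white, in percent."""
    return max_bar / max_white * 100.0


def saturation_percent(min_bar, max_bar, gamma=1 / 0.45):
    """Saturation of a colored bar from its minimum and maximum R'G'B' levels."""
    return (1.0 - (min_bar / max_bar) ** gamma) * 100.0


print(amplitude_percent(0.75, 1.0))                     # 75% bars against 100% white
print(saturation_percent(0.0, 0.75))                    # minimum component is zero
print(round(saturation_percent(0.25, 1.0, gamma=2.8)))  # 100/0/100/25 bars, gamma 2.8
```

Any bar whose minimum component is zero is 100% saturated regardless of its amplitude, which is why the 75% and 100% bars are both described as fully saturated.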

NTSC Color Bars 

In 1953, it was normal practice for the analog
R'G'B' signals to have a 7.5 IRE setup, and
the original NTSC equations assumed this
form of input to an encoder. Today, digital
R'G'B' or YCbCr signals typically do not
include the 7.5 IRE setup, and the 7.5 IRE
setup is added within the encoder.

The different color bar signals are 
described by four amplitudes, expressed in 
percent, separated by oblique strokes. 100% 
saturation is implied, so saturation is not speci- 
fied. The first and second numbers are the 
white and black amplitudes, respectively. The 
third and fourth numbers are the white and 
black amplitudes from which the color bars are 
derived. 

For example, 100/7.5/75/7.5 color bars 
would be 75% color bars with 7.5% setup in 
which the white bar has been set to 100% and 
the black bar to 7.5%. Since NTSC systems usu- 
ally have the 7.5% setup, the two common color 
bars are 75/7.5/75/7.5 and 100/7.5/100/7.5, 
which are usually shortened to 75% and 100%, 
respectively. The 75% bars are most commonly 
used. Television transmitters do not pass infor- 
mation with an amplitude greater than about 
120 IRE. Therefore, the 75% color bars are 
used for transmission testing. The 100% color 
bars may be used for testing in situations 
where a direct connection between equipment 
is possible. The 75/7.5/75/7.5 color bars are a 
part of the Electronic Industries Association 
EIA-189-A Encoded Color Bar Standard. 

Figure 8.27 shows a typical vectorscope 
display for full-screen 75% NTSC color bars. 
Figure 8.28 illustrates the video waveform for 
75% color bars. 



Tables 8.10 and 8.11 list the luminance and 
chrominance levels for the two common color 
bar formats for NTSC. 

For reference, the RGB and YCbCr values 
to generate the standard NTSC color bars are 
shown in Tables 8.12 and 8.13. RGB is 
assumed to have a range of 0-255; YCbCr is 
assumed to have a range of 16-235 for Y and 
16-240 for Cb and Cr. It is assumed any 7.5 IRE 
setup is implemented within the encoder. 
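
The digital values in those tables can be regenerated from the bar colors. The Python sketch below is one way to do it, assuming the usual 8-bit scaling in which Y spans 16-235 and Cb and Cr span 16-240 around 128; the names `BARS` and `color_bar_values` are illustrative, and the bar amplitude is applied as an exact fraction (0.75 or 1.0) before rounding.

```python
# Bar colors as on/off flags for (R', G', B'), in display order.
BARS = {
    "white":   (1, 1, 1),
    "yellow":  (1, 1, 0),
    "cyan":    (0, 1, 1),
    "green":   (0, 1, 0),
    "magenta": (1, 0, 1),
    "red":     (1, 0, 0),
    "blue":    (0, 0, 1),
    "black":   (0, 0, 0),
}


def color_bar_values(amplitude):
    """Return {bar: ((R', G', B'), (Y, Cb, Cr))} as 8-bit integers."""
    table = {}
    for name, (r, g, b) in BARS.items():
        r, g, b = r * amplitude, g * amplitude, b * amplitude
        y = 0.299 * r + 0.587 * g + 0.114 * b
        ycbcr = (
            round(16 + 219 * y),                    # Y: 16-235
            round(128 + 112 * (b - y) / 0.886),     # Cb: 16-240, 0.886 = 1 - 0.114
            round(128 + 112 * (r - y) / 0.701),     # Cr: 16-240, 0.701 = 1 - 0.299
        )
        rgb8 = tuple(round(255 * v) for v in (r, g, b))
        table[name] = (rgb8, ycbcr)
    return table


for name, (rgb8, ycbcr) in color_bar_values(0.75).items():
    print(f"{name:8s} R'G'B'={rgb8}  YCbCr={ycbcr}")
```

Calling `color_bar_values(1.0)` gives the 100% bars in the same way.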

PAL Color Bars 

Unlike NTSC, PAL does not support a 7.5 IRE setup; the black and blank levels are the same. The different color bar signals are usually described by four amplitudes, expressed in percent, separated by oblique strokes. The first and second numbers are the maximum and minimum percentages, respectively, of R'G'B' values for an uncolored bar. The third and fourth numbers are the maximum and minimum percentages, respectively, of R'G'B' values for a colored bar.

Since PAL systems have a 0% setup, the 
two common color bars are 100/0/75/0 and 
100/0/100/0, which are usually shortened to 
75% and 100%, respectively. The 75% color bars 
are used for transmission testing. The 100% 
color bars may be used for testing in situations 
where a direct connection between equipment 
is possible. 

The 100/0/75/0 color bars also are referred to as EBU (European Broadcasting Union) color bars. All of the color bars discussed in this section are also a part of Specification of Television Standards for 625-line System-I Transmissions (1971) published by the Independent Television Authority (ITA) and the British Broadcasting Corporation (BBC), and ITU-R BT.471.
Figure 8.27. Typical Vectorscope Display for 75% NTSC Color Bars.

[Waveform levels shown: color burst (9 cycles), white level (100 IRE), black level (7.5 IRE), blank level (0 IRE), sync level (-40 IRE).]

Figure 8.28. IRE Values for 75% NTSC Color Bars.





           Luminance   Chrominance   Minimum       Maximum       Chrominance
           (IRE)       Level         Chrominance   Chrominance   Phase
                       (IRE)         Excursion     Excursion     (degrees)
                                     (IRE)         (IRE)
white      76.9        0             -             -             -
yellow     69.0        62.1          37.9          100.0         167.1
cyan       56.1        87.7          12.3          100.0         283.5
green      48.2        81.9          7.3           89.2          240.7
magenta    36.2        81.9          -4.8          77.1          60.7
red        28.2        87.7          -15.6         72.1          103.5
blue       15.4        62.1          -15.6         46.4          347.1
black      7.5         0             -             -             -

Table 8.10. 75/7.5/75/7.5 (75%) NTSC Color Bars.
           Luminance   Chrominance   Minimum       Maximum       Chrominance
           (IRE)       Level         Chrominance   Chrominance   Phase
                       (IRE)         Excursion     Excursion     (degrees)
                                     (IRE)         (IRE)
white      100.0       0             -             -             -
yellow     89.5        82.8          48.1          130.8         167.1
cyan       72.3        117.0         13.9          130.8         283.5
green      61.8        109.2         7.2           116.4         240.7
magenta    45.7        109.2         -8.9          100.3         60.7
red        35.2        117.0         -23.3         93.6          103.5
blue       18.0        82.8          -23.3         59.4          347.1
black      7.5         0             -             -             -

Table 8.11. 100/7.5/100/7.5 (100%) NTSC Color Bars.
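
The IRE levels in these NTSC tables can be approximated from the bar colors. The Python sketch below assumes the usual color difference scaling U = 0.492(B' - Y) and V = 0.877(R' - Y), a 7.5 IRE setup with the remaining 92.5 IRE carrying the picture, and chrominance phase measured as the angle of (U, V); the function name is illustrative. Results agree with the 75% table to about 0.1 IRE, and small differences elsewhere may come from rounding in the printed values.

```python
import math


def ntsc_bar_levels(r, g, b, setup=7.5):
    """Return (luminance, peak-to-peak chrominance, minimum excursion,
    maximum excursion, phase in degrees) for normalized R'G'B' inputs."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    scale = 100.0 - setup                       # IRE available above setup
    luminance = setup + scale * y
    chroma_pp = 2.0 * scale * math.hypot(u, v)  # peak-to-peak subcarrier amplitude
    phase = math.degrees(math.atan2(v, u)) % 360.0
    return (luminance, chroma_pp,
            luminance - chroma_pp / 2, luminance + chroma_pp / 2, phase)


bars_75 = {"yellow": (0.75, 0.75, 0.0), "cyan": (0.0, 0.75, 0.75),
           "green": (0.0, 0.75, 0.0), "red": (0.75, 0.0, 0.0)}
for name, rgb in bars_75.items():
    print(name, [round(x, 1) for x in ntsc_bar_levels(*rgb)])
```

The maximum excursion of the yellow and cyan bars lands at about 100 IRE for 75% bars, which is consistent with using them for transmission testing.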





        White   Yellow   Cyan   Green   Magenta   Red   Blue   Black

gamma-corrected RGB (gamma = 1/0.45)
R'      191     191      0      0       191       191   0      0
G'      191     191      191    191     0         0     0      0
B'      191     0        191    0       191       0     191    0

linear RGB
R       135     135      0      0       135       135   0      0
G       135     135      135    135     0         0     0      0
B       135     0        135    0       135       0     135    0

YCbCr
Y       180     162      131    112     84        65    35     16
Cb      128     44       156    72      184       100   212    128
Cr      128     142      44     58      198       212   114    128

Table 8.12. RGB and YCbCr Values for 75% NTSC Color Bars.
        White   Yellow   Cyan   Green   Magenta   Red   Blue   Black

gamma-corrected RGB (gamma = 1/0.45)
R'      255     255      0      0       255       255   0      0
G'      255     255      255    255     0         0     0      0
B'      255     0        255    0       255       0     255    0

linear RGB
R       255     255      0      0       255       255   0      0
G       255     255      255    255     0         0     0      0
B       255     0        255    0       255       0     255    0

YCbCr
Y       235     210      170    145     106       81    41     16
Cb      128     16       166    54      202       90    240    128
Cr      128     146      16     34      222       240   110    128

Table 8.13. RGB and YCbCr Values for 100% NTSC Color Bars.



Figure 8.29 illustrates the video waveform
for 75% color bars. Figure 8.30 shows a typical
vectorscope display for full-screen 75% PAL
color bars.

Tables 8.14, 8.15, and 8.16 list the lumi- 
nance and chrominance levels for the three 
common color bar formats for PAL. 

For reference, the RGB and YCbCr values 
to generate the standard PAL color bars are 
shown in Tables 8.17, 8.18, and 8.19. RGB is 
assumed to have a range of 0-255; YCbCr is 
assumed to have a range of 16-235 for Y and 
16-240 for Cb and Cr. 



EIA Color Bars (NTSC) 

The EIA color bars (Figure 8.28 and Table 
8.10) are a part of the EIA-189-A standard. The 
seven bars (gray, yellow, cyan, green, magenta, 
red, and blue) are at 75% amplitude, 100% satu- 
ration. The duration of each color bar is 1/7 of 
the active portion of the scan line. Note that 
the black bar in Figure 8.28 and Table 8.10 is 
not part of the standard and is shown for refer- 
ence only. The color bar test signal allows 
checking for hue and color saturation accu- 
racy. 
           Luminance   Peak-to-Peak Chrominance        Chrominance Phase (degrees)
           (volts)     U axis    V axis    Total       Line n          Line n + 1
                       (volts)   (volts)   (volts)     (135° burst)    (225° burst)
white      0.700       0         -         -           -               -
yellow     0.465       0.459     0.105     0.470       167             193
cyan       0.368       0.155     0.646     0.664       283.5           76.5
green      0.308       0.304     0.541     0.620       240.5           119.5
magenta    0.217       0.304     0.541     0.620       60.5            299.5
red        0.157       0.155     0.646     0.664       103.5           256.5
blue       0.060       0.459     0.105     0.470       347             13.0
black      0           0         0         0           -               -

Table 8.14. 100/0/75/0 (75%) PAL Color Bars.





           Luminance   Peak-to-Peak Chrominance        Chrominance Phase (degrees)
           (volts)     U axis    V axis    Total       Line n          Line n + 1
                       (volts)   (volts)   (volts)     (135° burst)    (225° burst)
white      0.700       0         -         -           -               -
yellow     0.620       0.612     0.140     0.627       167             193
cyan       0.491       0.206     0.861     0.885       283.5           76.5
green      0.411       0.405     0.721     0.827       240.5           119.5
magenta    0.289       0.405     0.721     0.827       60.5            299.5
red        0.209       0.206     0.861     0.885       103.5           256.5
blue       0.080       0.612     0.140     0.627       347             13.0
black      0           0         0         0           -               -

Table 8.15. 100/0/100/0 (100%) PAL Color Bars.











           Luminance   Peak-to-Peak Chrominance        Chrominance Phase (degrees)
           (volts)     U axis    V axis    Total       Line n          Line n + 1
                       (volts)   (volts)   (volts)     (135° burst)    (225° burst)
white      0.700       0         -         -           -               -
yellow     0.640       0.459     0.105     0.470       167             193
cyan       0.543       0.155     0.646     0.664       283.5           76.5
green      0.483       0.304     0.541     0.620       240.5           119.5
magenta    0.392       0.304     0.541     0.620       60.5            299.5
red        0.332       0.155     0.646     0.664       103.5           256.5
blue       0.235       0.459     0.105     0.470       347             13.0
black      0           0         0         0           -               -

Table 8.16. 100/0/100/25 (98%) PAL Color Bars.



[Waveform levels shown: 4.43 MHz color burst (10 cycles), white level (100 IRE), black/blank level (0 IRE), sync level (-43 IRE).]

Figure 8.29. IRE Values for 75% PAL Color Bars.

Figure 8.30. Typical Vectorscope Display for 75% PAL Color Bars.
        White   Yellow   Cyan   Green   Magenta   Red   Blue   Black

gamma-corrected RGB (gamma = 1/0.45)
R'      255     191      0      0       191       191   0      0
G'      255     191      191    191     0         0     0      0
B'      255     0        191    0       191       0     191    0

linear RGB
R       255     135      0      0       135       135   0      0
G       255     135      135    135     0         0     0      0
B       255     0        135    0       135       0     135    0

YCbCr
Y       235     162      131    112     84        65    35     16
Cb      128     44       156    72      184       100   212    128
Cr      128     142      44     58      198       212   114    128

Table 8.17. RGB and YCbCr Values for 75% PAL Color Bars.





           White  Yellow    Cyan   Green Magenta     Red    Blue   Black

gamma-corrected RGB (gamma = 1/0.45)
R'           255     255       0       0     255     255       0       0
G'           255     255     255     255       0       0       0       0
B'           255       0     255       0     255       0     255       0

linear RGB
R            255     255       0       0     255     255       0       0
G            255     255     255     255       0       0       0       0
B            255       0     255       0     255       0     255       0

YCbCr
Y            235     210     170     145     106      81      41      16
Cb           128      16     166      54     202      90     240     128
Cr           128     146      16      34     222     240     110     128

Table 8.18. RGB and YCbCr Values for 100% PAL Color Bars. 
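
The YCbCr rows of Tables 8.17 and 8.18 can be reproduced with a short calculation. The sketch below assumes the BT.601 luma weights (0.299, 0.587, 0.114) and the 8-bit ranges 16-235 for Y and 16-240 for Cb and Cr; those constants are not stated in this section, and the function name `rgb_to_ycbcr` is purely illustrative. Inputs are gamma-corrected R'G'B' values normalized to 0-1, so a 75% bar uses 0.75 rather than 191/255.

```python
# Assumed BT.601 luma weights (not given in this section).
KR, KG, KB = 0.299, 0.587, 0.114


def rgb_to_ycbcr(r, g, b):
    """Map gamma-corrected R'G'B' in the range 0..1 to 8-bit (Y, Cb, Cr)."""
    y = KR * r + KG * g + KB * b
    cb = (b - y) / (2 * (1 - KB))  # color difference scaled to -0.5..0.5
    cr = (r - y) / (2 * (1 - KR))
    # Y occupies 16..235; Cb and Cr occupy 16..240 centered on 128.
    return (int(16 + 219 * y + 0.5),
            int(128 + 224 * cb + 0.5),
            int(128 + 224 * cr + 0.5))


# Bar order used in the tables: (name, R', G', B') at full amplitude.
BARS = [("white", 1, 1, 1), ("yellow", 1, 1, 0), ("cyan", 0, 1, 1),
        ("green", 0, 1, 0), ("magenta", 1, 0, 1), ("red", 1, 0, 0),
        ("blue", 0, 0, 1), ("black", 0, 0, 0)]

if __name__ == "__main__":
    for name, r, g, b in BARS:
        # White stays at 100%; the colored bars are scaled to 75%.
        k = 1.0 if name == "white" else 0.75
        print(name, rgb_to_ycbcr(k * r, k * g, k * b))
```

Run as a script, it prints one (Y, Cb, Cr) triple per bar for the 75% case; passing unscaled values gives the 100% case.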













           White  Yellow    Cyan   Green Magenta     Red    Blue   Black

gamma-corrected RGB (gamma = 1/0.45)
R'           255     255      44      44     255     255      44      44
G'           255     255     255     255      44      44      44      44
B'           255      44     255      44     255      44     255      44

linear RGB
R            255     255       5       5     255     255       5       5
G            255     255     255     255       5       5       5       5
B            255       5     255       5     255       5     255       5

YCbCr
Y            235     216     186     167     139     120      90      16
Cb           128      44     156      72     184     100     212     128
Cr           128     142      44      58     198     212     114     128

Table 8.19. RGB and YCbCr Values for 98% PAL Color Bars. 



EBU Color Bars (PAL) 

The EBU color bars are similar to the EIA 
color bars, except a 100 IRE white level is used 
(see Figure 8.29 and Table 8.14). The six col- 
ored bars (yellow, cyan, green, magenta, red, 
and blue) are at 75% amplitude, 100% satura- 
tion, while the white bar is at 100% amplitude. 
The duration of each color bar is 1/7 of the 
active portion of the scan line. Note that the 
black bar in Figure 8.29 and Table 8.14 is not 
part of the standard and is shown for reference 
only. The color bar test signal allows checking 
for hue and color saturation accuracy. 

SMPTE Bars (NTSC) 

This split-field test signal is composed of the EIA color bars for the first 2/3 of the field, the reverse blue bars for the next 1/12 of the field, and the PLUGE test signal for the remainder of the field.

Reverse Blue Bars 

The reverse blue bars are composed of the blue, magenta, and cyan color bars from the EIA/EBU color bars, but are arranged in a different order: blue, black, magenta, black, cyan, black, and white. The duration of each color bar is 1/7 of the active portion of the scan line. Typically, reverse blue bars are used with the EIA or EBU color bar signal in a split-field arrangement, with the EIA/EBU color bars comprising the first 3/4 of the field and the reverse blue bars comprising the remainder of the field. This split-field arrangement eases adjustment of chrominance and hue on a color monitor.

PLUGE

PLUGE (Picture Line-Up Generating 
Equipment) is a visual black reference, with 
one area blacker-than-black, one area at black, 
and one area lighter-than-black. The bright- 
ness of the monitor is adjusted so that the 
black and blacker-than-black areas are indistin- 
guishable from each other and the lighter- 
than-black area is slightly lighter (the contrast 
should be at the normal setting). Additional 
test signals, such as a white pulse and modu- 
lated IQ signals, are usually added to facilitate 
testing and monitor alignment. 



The NTSC PLUGE test signal (shown in 
Figure 8.31) is composed of a 7.5 IRE (black 
level) pedestal with a 40 IRE “-I” phase modu- 
lation, a 100 IRE white pulse, a 40 IRE “+Q” 
phase modulation, and 3.5 IRE, 7.5 IRE, and 
11.5 IRE pedestals. Typically, PLUGE is used 
as part of the SMPTE bars. 

[Figure 8.31 labels: +100 IRE; black level (7.5 IRE); blank level (0 IRE); sync level (-40 IRE).]

Figure 8.31. PLUGE Test Signal for NTSC. IRE values are indicated.

For PAL, each country has its own slightly different PLUGE configuration, with most differences being in the black pedestal level used, and work is being done on a standard test signal. Figure 8.32 illustrates a typical PAL PLUGE test signal. Usually used as a full-screen test signal, it is composed of a 0 IRE pedestal with PLUGE (-2 IRE, 0 IRE, and 2 IRE pedestals) and a white pulse. The white pulse may have five levels of brightness (0, 25, 50, 75, and 100 IRE), depending on the scan line number, as shown in Figure 8.32. The PLUGE is displayed on scan lines that have non-zero IRE white pulses. ITU-R BT.1221 discusses considerations for various PAL systems.



Y Bars 

The Y bars consist of the luminance-only 
levels of the EIA/EBU color bars; however, the 
black level (7.5 IRE for NTSC and 0 IRE for 
PAL) is included and the color burst is still 
present. The duration of each luminance bar is 
therefore 1/8 of the active portion of the scan 
line. Y bars are useful for color monitor adjust- 
ment and measuring luminance nonlinearity. 
Typically, the Y bars signal is used with the EIA or EBU color bar signal in a split-field arrangement, with the EIA/EBU color bars comprising the first 3/4 of the field and the Y bars signal comprising the remainder of the field.



[Figure 8.32 labels: 4.43 MHz color burst (10 cycles); blank/black level (0 IRE); sync level (-43 IRE); time axis in microseconds (22.5, 24.8, 27.1, 29.4).]

Figure 8.32. PLUGE Test Signal for PAL. IRE values are indicated.

Red Field

The red field signal consists of a 75% ampli- 
tude, 100% saturation red chrominance signal. 
This is useful as the human eye is sensitive to 
static noise intermixed in a red field. Distor- 
tions that cause small errors in picture quality 
can be examined visually for the effect on the 
picture. Typically, the red field signal is used 
with the EIA/EBU color bars signal in a split- 
field arrangement, with the EIA/EBU color 
bars comprising the first 3/4 of the field, and 
the red field signal comprising the remainder 
of the field. 

10-Step Staircase 

This test signal is composed of ten unmod- 
ulated luminance steps of 10 IRE each, ranging 
from 0 IRE to 100 IRE, shown in Figure 8.33. 
This test signal may be used to measure lumi- 
nance nonlinearity. 



Modulated Ramp 

The modulated ramp test signal, shown in Figure 8.34, is composed of a luminance ramp from 0 IRE to either 80 or 100 IRE, superimposed with modulated chrominance that has a phase of 0° ±1° relative to the burst. The 80 IRE ramp provides testing of the normal operating range of the system; a 100 IRE ramp may optionally be used to test the entire operating range. The peak-to-peak modulated chrominance is 40 ±0.5 IRE for (M) NTSC and 42.86 ±0.5 IRE for (B, D, G, H, I) PAL. Note that a 0 IRE setup is used. The rise and fall times at the start and end of the modulated ramp envelope are 400 ±25 ns (NTSC systems) or approximately 1 µs (PAL systems). This test signal may be used to measure differential gain. The modulated ramp signal is preferred over a 5-step or 10-step modulated staircase signal when testing digital systems.



[Figure 8.33 labels: color burst; IRE levels; white level (100 IRE); blank level (0 IRE); sync level; time axis in microseconds (17.5, 21.5, 25.5, 29.5, 33.5, 37.5, 41.5, 45.5, 49.5, 53.5, 61.8).]

Figure 8.33. Ten-Step Staircase Test Signal for NTSC and PAL.

Modulated Staircase

The 5-step modulated staircase signal (a 10-step version is also used), shown in Figure 8.35, consists of 5 luminance steps, superimposed with modulated chrominance that has a phase of 0° ±1° relative to the burst. The peak-to-peak modulated chrominance amplitude is 40 ±0.5 IRE for (M) NTSC and 42.86 ±0.5 IRE for (B, D, G, H, I) PAL. Note that a 0 IRE setup is used. The rise and fall times of each modulation packet envelope are 400 ±25 ns (NTSC systems) or approximately 1 µs (PAL systems). The luminance IRE levels for the 5-step modulated staircase signal are shown in Figure 8.35. This test signal may be used to measure differential gain. The modulated ramp signal is preferred over a 5-step or 10-step modulated staircase signal when testing digital systems.



Modulated Pedestal 

The modulated pedestal test signal (also called a three-level chrominance bar), shown in Figure 8.36, is composed of a 50 IRE luminance pedestal, superimposed with three amplitudes of modulated chrominance that has a phase relative to the burst of -90° ±1°. The peak-to-peak amplitudes of the modulated chrominance are 20 ±0.5, 40 ±0.5, and 80 ±0.5 IRE for (M) NTSC and 20 ±0.5, 60 ±0.5, and 100 ±0.5 IRE for (B, D, G, H, I) PAL. Note that a 0 IRE setup is used. The rise and fall times of each modulation packet envelope are 400 ±25 ns (NTSC systems) or approximately 1 µs (PAL systems). This test signal may be used to measure chrominance-to-luminance intermodulation and chrominance nonlinear gain.




Figure 8.34. 80 IRE Modulated Ramp Test Signal for NTSC and PAL. 




[Figure 8.35 labels: color burst; luminance step labels 0, 18, 36, 72, and 90 IRE; blank level (0 IRE); sync level.]

Figure 8.35. Five-Step Modulated Staircase Test Signal for NTSC and PAL. 



[Figure 8.36 labels: chrominance packets of ±10 IRE (±10), ±20 IRE (±30), and ±40 IRE (±50); color burst; blank level (0 IRE); sync level; time axis in microseconds (10.0, 17.9, 29.8, 41.7, 53.6, 61.6).]

Figure 8.36. Modulated Pedestal Test Signal for NTSC and PAL. PAL IRE values are shown in parentheses.

Multiburst

The multiburst test signal for (M) NTSC, shown in Figure 8.37, consists of a white flag with a peak amplitude of 100 ±1 IRE and six frequency packets, each a specific frequency. The packets have a 40 ±1 IRE pedestal with peak-to-peak amplitudes of 60 ±0.5 IRE. Note that a 0 IRE setup is used and the starting and ending points of each packet are at zero phase.

The ITU multiburst test signal for (B, D, G, H, I) PAL, shown in Figure 8.38, consists of a 4 µs white flag with a peak amplitude of 80 ±1 IRE and six frequency packets, each a specific frequency. The packets have a 50 ±1 IRE pedestal with peak-to-peak amplitudes of 60 ±0.5 IRE. Note that the starting and ending points of each packet are at zero phase. The gaps between packets are 0.4-2.0 µs. The ITU multiburst test signal may be present on line 18.

The multiburst signals are used to test the 
frequency response of the system by measur- 
ing the peak-to-peak amplitudes of the packets. 



Line Bar 

The line bar is a single 100 ±0.5 IRE (reference white) pulse of 10 µs (PAL), 18 µs (NTSC), or 25 µs (PAL) that occurs anywhere within the active scan line time (rise and fall times are < 1 µs). Note that the color burst is not present, and a 0 IRE setup is used. This test signal is used to measure line time distortion (line tilt or H tilt). A digital encoder or decoder does not generate line time distortion; the distortion is generated primarily by the analog filters and transmission channel.

Multipulse 

The (M) NTSC multipulse contains a 2T pulse and 25T and 12.5T pulses with various high-frequency components, as shown in Figure 8.39. The (B, D, G, H, I) PAL multipulse is similar, except 20T and 10T pulses are used, and there is no 7.5 IRE setup. This test signal is typically used to measure the frequency response of the transmission channel.

Figure 8.37. Multiburst Test Signal for NTSC.

Figure 8.38. ITU Multiburst Test Signal for PAL.

Figure 8.39. Multipulse Test Signal for NTSC and PAL. PAL values are shown in parentheses.

Field Square Wave

The field square wave contains 100 ±0.5 IRE pulses for the entire active line time for Field 1 and blanked scan lines for Field 2. Note that the color burst is not present and a 0 IRE setup is used. This test signal is used to measure field time distortion (field tilt or V tilt). A digital encoder or decoder does not generate field time distortion; the distortion is generated primarily by the analog filters and transmission channel.



Composite Test Signal 

NTC-7 Version for NTSC 

The NTC (U.S. Network Transmission Committee) has developed a composite test signal that may be used to test several video parameters, rather than using multiple test signals. The NTC-7 composite test signal for NTSC systems (shown in Figure 8.40) consists of a 100 IRE line bar, a 2T pulse, a 12.5T chrominance pulse, and a 5-step modulated staircase signal.

Figure 8.40. NTC-7 Composite Test Signal for NTSC, with Corresponding IRE Values.

The line bar has a peak amplitude of 100 ±0.5 IRE, and 10-90% rise and fall times of 125 ±5 ns with an integrated sine-squared shape. It has a width at the 60 IRE level of 18 µs.

The 2T pulse has a peak amplitude of 100 ±0.5 IRE, with a half-amplitude width of 250 ±10 ns.

The 12.5T chrominance pulse has a peak amplitude of 100 ±0.5 IRE, with a half-amplitude width of 1562.5 ±50 ns.

The 5-step modulated staircase signal consists of 5 luminance steps superimposed with a 40 ±0.5 IRE subcarrier that has a phase of 0° ±1° relative to the burst. The rise and fall times of each modulation packet envelope are 400 ±25 ns.

The NTC-7 composite test signal may be 
present on line 17. 



ITU Version for PAL 

The ITU (BT.628 and BT.473) has devel- 
oped a composite test signal that may be used 
to test several video parameters, rather than 
using multiple test signals. The ITU composite 
test signal for PAL systems (shown in Figure 
8.41) consists of a white flag, a 2T pulse, and a 
5-step modulated staircase signal. 

The white flag has a peak amplitude of 100 ±1 IRE and a width of 10 µs.

The 2T pulse has a peak amplitude of 100 ±0.5 IRE, with a half-amplitude width of 200 ±10 ns.

The 5-step modulated staircase signal consists of 5 luminance steps (whose IRE values are shown in Figure 8.41) superimposed with a 42.86 ±0.5 IRE subcarrier that has a phase of 60° ±1° relative to the U axis. The rise and fall times of each modulation packet envelope are approximately 1 µs.




[Figure 8.41 labels: blank level (0 IRE); sync level (-43 IRE).]

Figure 8.41. ITU Composite Test Signal for PAL, with Corresponding IRE Values.

The ITU composite test signal may be present on line 330.

U.K. Version 

The United Kingdom allows the use of a 
slightly different test signal since the 10T 
pulse is more sensitive to delay errors than the 
20T pulse (at the expense of occupying less 
chrominance bandwidth). Selection of an 
appropriate pulse width is a trade-off between 
occupying the PAL chrominance bandwidth as 
fully as possible and obtaining a pulse with suf- 
ficient sensitivity to delay errors. Thus, the 
national test signal (developed by the British 
Broadcasting Corporation and the Indepen- 
dent Television Authority) in Figure 8.42 may 
be present on lines 19 and 332 for (I) PAL sys- 
tems in the United Kingdom. 



The white flag has a peak amplitude of 100 ±1 IRE and a width of 10 µs.

The 2T pulse has a peak amplitude of 100 ±0.5 IRE, with a half-amplitude width of 200 ±10 ns.

The 10T chrominance pulse has a peak amplitude of 100 ±0.5 IRE.

The 5-step modulated staircase signal consists of 5 luminance steps (whose IRE values are shown in Figure 8.42) superimposed with a 21.43 ±0.5 IRE subcarrier that has a phase of 60° ±1° relative to the U axis. The rise and fall times of each modulation packet envelope are approximately 1 µs.




[Figure 8.42 labels: blank level (0 IRE); sync level (-43 IRE).]

Figure 8.42. United Kingdom (I) PAL National Test Signal #1, with Corresponding IRE Values.

Combination Test Signal

NTC-7 Version for NTSC 

The NTC (U.S. Network Transmission 
Committee) has also developed a combination 
test signal that may be used to test several 
video parameters, rather than using multiple 
test signals. The NTC-7 combination test sig- 
nal for NTSC systems (shown in Figure 8.43) 
consists of a white flag, a multiburst, and a 
modulated pedestal signal. 

The white flag has a peak amplitude of 100 ±1 IRE and a width of 4 µs.

The multiburst has a 50 ±1 IRE pedestal with peak-to-peak amplitudes of 50 ±0.5 IRE. The starting point of each frequency packet is at zero phase. The width of the 0.5 MHz packet is 5 µs; the width of the remaining packets is 3 µs.

The 3-step modulated pedestal is composed of a 50 IRE luminance pedestal, superimposed with three amplitudes of modulated chrominance (20 ±0.5, 40 ±0.5, and 80 ±0.5 IRE peak-to-peak) that have a phase of -90° ±1° relative to the burst. The rise and fall times of each modulation packet envelope are 400 ±25 ns.

The NTC-7 combination test signal may be present on line 280.

Figure 8.43. NTC-7 Combination Test Signal for NTSC.

ITU Version for PAL

The ITU (BT.473) has developed a combination test signal that may be used to test several video parameters, rather than using multiple test signals. The ITU combination test signal for PAL systems (shown in Figure 8.44) consists of a white flag, a 2T pulse, a 20T modulated chrominance pulse, and a 5-step luminance staircase signal.

The line bar has a peak amplitude of 100 ±1 IRE and a width of 10 µs.

The 2T pulse has a peak amplitude of 100 ±0.5 IRE, with a half-amplitude width of 200 ±10 ns.

The 20T chrominance pulse has a peak amplitude of 100 ±0.5 IRE, with a half-amplitude width of 2.0 ±0.06 µs.

The 5-step luminance staircase signal consists of 5 luminance steps, at 20, 40, 60, 80, and 100 ±0.5 IRE.

The ITU combination test signal may be 
present on line 17. 

ITU ITS Version for PAL 

The ITU (BT.473) has developed a combination ITS (insertion test signal) that may be used to test several PAL video parameters, rather than using multiple test signals. The ITU combination ITS for PAL systems (shown in Figure 8.45) consists of a 3-step modulated pedestal with peak-to-peak amplitudes of 20, 60, and 100 ±1 IRE, and an extended subcarrier packet with a peak-to-peak amplitude of 60 ±1 IRE. The rise and fall times of each subcarrier packet envelope are approximately 1 µs. The phase of each subcarrier packet is 60° ±1° relative to the U axis. The tolerance on the 50 IRE level is ±1 IRE.

The ITU composite ITS may be present on 
line 331. 

U.K. Version 

The United Kingdom allows the use of a 
slightly different test signal, as shown in Fig- 
ure 8.46. It may be present on lines 20 and 333 
for (I) PAL systems in the United Kingdom. 

The test signal consists of a 50 IRE luminance bar, part of which has a 100 IRE subcarrier superimposed that has a phase of 60° ±1° relative to the U axis, and an extended burst of subcarrier on the second half of the scan line.

Figure 8.44. ITU Combination Test Signal for PAL.

[Figure 8.45 labels: 100 IRE; 80 IRE; 50 IRE; 20 IRE; 4.43 MHz color burst (10 cycles); blank level (0 IRE); sync level (-43 IRE); time axis in microseconds (12, 14, 18, 22, 28, 34); additional label 60.]

Figure 8.45. ITU Combination ITS Test Signal for PAL.

Figure 8.46. United Kingdom (I) PAL National Test Signal #2.

T Pulse

Square waves with fast rise times cannot be used for testing video systems, since attenuation and phase shift of out-of-band components cause ringing in the output signal, obscuring the in-band distortions being measured. T, or sin², pulses are bandwidth-limited, so they are used for testing video systems.

The 2T pulse is shown in Figure 8.47 and, like the T pulse, is obtained mathematically by squaring a half-cycle of a sine wave. T pulses are specified in terms of half amplitude duration (HAD), which is the pulse width measured at 50% of the pulse amplitude. Pulses with HADs that are multiples of the time interval T are used to test video systems. As seen in Figures 8.39 through 8.44, T, 2T, 12.5T, and 25T pulses are common when testing NTSC video systems, whereas T, 2T, 10T, and 20T pulses are common for PAL video systems.

T is the Nyquist interval, or

$T = \frac{1}{2 F_C}$

where $F_C$ is the cutoff frequency of the video system. For NTSC, $F_C$ is 4 MHz, whereas $F_C$ for PAL systems is 5 MHz. Therefore, T for NTSC systems is 125 ns and for PAL systems it is 100 ns. For a T pulse with a HAD of 125 ns, a 2T pulse has a HAD of 250 ns, and so on. The frequency spectrum of the 2T pulse is shown in Figure 8.47 and is representative of the energy content in a typical character generator waveform.
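
As a rough numerical illustration of the Nyquist interval and the sine-squared shape described above, the sketch below computes T from a cutoff frequency and evaluates a pulse of a given HAD. The function names are hypothetical, and the pulse is modeled as one squared half-cycle of a sine wave spanning twice the HAD, which is an assumption consistent with the description but not a formula quoted from the text.

```python
import math


def nyquist_interval(cutoff_hz):
    """T = 1 / (2 * F_C), in seconds."""
    return 1.0 / (2.0 * cutoff_hz)


def sin2_pulse(t, had):
    """Amplitude (0..1) at time t of a sin^2 pulse with half amplitude duration `had`.

    The pulse occupies 0 <= t <= 2 * had and peaks at t = had, so the
    50% points fall at had / 2 and 3 * had / 2, which are `had` apart.
    """
    if t < 0 or t > 2 * had:
        return 0.0
    return math.sin(math.pi * t / (2 * had)) ** 2


if __name__ == "__main__":
    t_ntsc = nyquist_interval(4e6)   # NTSC cutoff of 4 MHz
    t_pal = nyquist_interval(5e6)    # PAL cutoff of 5 MHz
    print(t_ntsc, t_pal)
    # Sample a 2T pulse for NTSC at its peak.
    print(sin2_pulse(2 * t_ntsc, 2 * t_ntsc))
```

Sampling `sin2_pulse` on a fine grid gives a waveform whose width at half amplitude equals the requested HAD.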

To generate smooth rising and falling edges of most video signals, a T step (generated by integrating a T pulse) is typically used. T steps have 10-90% rise/fall times of 0.964T and a well-defined bandwidth. The 2T step generated from a 2T pulse is shown in Figure 8.47.

Figure 8.47. The T Pulse. (a) 2T pulse. (b) 2T step. (c) Frequency spectra of the 2T pulse.

The 12.5T chrominance pulse, illustrated in Figure 8.48, is a good test signal to measure any chrominance-to-luminance timing error since its energy spectral distribution is bunched in two relatively narrow bands. Using this signal detects differences in the luminance and chrominance phase distortion, but not between other frequency groups.

Figure 8.48. The 12.5T Chrominance Pulse. (a) Luma component. (b) Chroma component. (c) Addition of (a) and (b).

VBI Data

VBI (vertical blanking interval) data may be inserted up to about five scan lines into the active picture region to ensure it won't be deleted by equipment replacing the VBI, by DSS MPEG, which deletes the VBI, or by cable systems inserting their own VBI data. This is common practice by Nielsen and others to ensure their programming and commercial tracking data gets through the distribution systems to the receivers. In most cases, this will be unseen since it is masked by the TV's overscan.

Timecode

Two types of timecoding are commonly used, as defined by ANSI/SMPTE 12M and IEC 461: longitudinal timecode (LTC) and vertical interval timecode (VITC).

The LTC is recorded on a separate audio track; as a result, the analog VCR must use high-bandwidth amplifiers and audio heads. This is due to the timecode frequency increasing as tape speed increases, until the point that the frequency response of the system results in a distorted timecode signal that may not be read reliably. At slower tape speeds, the timecode frequency decreases, until at very low tape speeds or still pictures, the timecode information is no longer recoverable.

The VITC is recorded as part of the video 
signal; as a result, the timecode information is 
always available, regardless of the tape speed. 
However, the LTC allows the timecode signal 
to be written without writing a video signal; the 
VITC requires the video signal to be changed if 
a change in timecode information is required. 
The LTC therefore is useful for synchronizing 
multiple audio or audio/video sources. 

Frame Dropping 

If the field rate is 60/1.001 fields per sec- 
ond, straight counting at 60 fields per second 
yields an error of about 108 frames for each 
hour of running time. This may be handled in 
one of three ways: 

Nondrop frame: During a continuous recording, each time count increases by 1 frame. In this mode, the drop frame flag will be a “0.”

Drop frame: To minimize the timing error, 
the first two frame numbers (00 and 01) at 
the start of each minute, except for minutes 
00, 10, 20, 30, 40, and 50, are omitted from the 
count. In this mode, the drop frame flag will 
be a “1.” 

Drop frame for (M) PAL: To minimize the 
timing error, the first four frame numbers (00 
to 03) at the start of every second minute 
(even minute numbers) are omitted from the 
count, except for minutes 00, 20, and 40. In 
this mode, the drop frame flag will be a “1.” 
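
The drop frame rule for 30-frame systems (the second mode above) can be sketched as a conversion from a running frame count to a timecode label. The function names are hypothetical and the code covers only that mode, not the (M) PAL variant. It also computes the hourly error quoted above, assuming a true rate of 30/1.001 frames per second.

```python
def dropframe_timecode(frame_count):
    """Return (hours, minutes, seconds, frames) for a 0-based frame count."""
    per_minute = 30 * 60 - 2          # 1798 labels in a minute that drops 00 and 01
    per_ten_minutes = 30 * 600 - 18   # 17982: nine of every ten minutes drop two
    blocks, rem = divmod(frame_count, per_ten_minutes)
    # 18 numbers are skipped per full ten-minute block, plus 2 for each
    # dropping minute already entered within the current block.
    skipped = 18 * blocks + 2 * max(0, (rem - 2) // per_minute)
    n = frame_count + skipped
    return (n // 108000 % 24, n // 1800 % 60, n // 30 % 60, n % 30)


def hourly_error_frames():
    """Frames gained per hour by counting at 30 instead of 30/1.001 frames/s."""
    return 30 * 3600 - (30 / 1.001) * 3600


if __name__ == "__main__":
    print(dropframe_timecode(1799))    # last label of minute 00
    print(dropframe_timecode(1800))    # first label of minute 01
    print(dropframe_timecode(17982))   # minute 10 keeps frames 00 and 01
    print(round(hourly_error_frames(), 3))
```

Dropping 2 numbers in 54 of every 60 minutes removes 108 labels per hour, which nearly cancels the hourly error; the small remainder is the long-term drift.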



Even with drop framing, there is a long-term error of about 2.6 frames per 24 hours.
This error accumulation is the reason time- 
code generators must be periodically reset if 
they are to maintain any correlation to the cor- 
rect time-of-day. Typically, this “reset-to-real- 
time” is referred to as a “jam sync” procedure. 
Some jam sync implementations reset the 
timecode to 00:00:00.00 and, therefore, must 
occur at midnight; others allow a true re-sync 
to the correct time-of-day. 

One inherent problem with jam sync cor- 
rection is the interruption of the timecode. 
Although this discontinuity may be brief, it 
may cause timecode readers to hiccup due to 
the interruption. 

Longitudinal Timecode (LTC) 

The LTC information is transferred using a 
separate serial interface, using the same elec- 
trical interface as the AES/EBU digital audio 
interface standard, and is recorded on a sepa- 
rate track. The basic structure of the time data 
is based on the BCD system. Tables 8.20 and 
8.21 list the LTC bit assignments and arrange- 
ment. Note that the 24-hour clock system is 
used. 

LTC Timing 

The modulation technique is such that a 
transition occurs at the beginning of every bit 
period. “1” is represented by a second transi- 
tion one-half a bit period from the start of the 
bit. “0” is represented when there is no transi- 
tion within the bit period (see Figure 8.49). 
The signal has a peak-to-peak amplitude of 0.5-4.5 V, with rise and fall times of 40 ±10 µs (10% to 90% amplitude points).
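
The modulation rule just described can be sketched as follows. The function name and the choice of representing the waveform as two logic levels per bit period are illustrative assumptions; real LTC is an analog signal with the amplitude and edge rates given above.

```python
def ltc_modulate(bits, level=0):
    """Return two half-bit levels (0 or 1) for each input bit."""
    out = []
    for bit in bits:
        level ^= 1          # a transition occurs at the start of every bit period
        out.append(level)
        if bit:
            level ^= 1      # a "1" adds a second transition at mid-bit
        out.append(level)   # a "0" holds the level for the whole period
    return out


if __name__ == "__main__":
    print(ltc_modulate([0, 1, 1, 0]))
```

Because the data is carried by transitions rather than absolute levels, the decoded bits are the same if the signal polarity is inverted.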

Because the entire frame time is used to generate the 80-bit LTC information, the bit-rate (in bits per second) is determined by:

$F_C = 80 F_V$

where $F_V$ is the vertical frame rate in frames per second. The 80 bits of timecode information are output serially, with bit 0 being first. The LTC word occupies the entire frame time, and the data must be evenly spaced throughout this time. The start of the LTC word occurs at the beginning of line 5 ±1.5 lines for 525-line systems, at the beginning of line 2 ±1.5 lines for 625-line systems, and at the vertical sync timing reference of the frame ±1 line for 1125-line systems.

Bit(s)   Function           Note
0-3      units of frames
4-7      user group 1
8-9      tens of frames
10       flag 1             note 1
11       flag 2             note 2
12-15    user group 2
16-19    units of seconds
20-23    user group 3
24-26    tens of seconds
27       flag 3             note 3
28-31    user group 4
32-35    units of minutes
36-39    user group 5
40-42    tens of minutes
43       flag 4             note 4
44-47    user group 6
48-51    units of hours
52-55    user group 7
56-57    tens of hours
58       flag 5             note 5
59       flag 6             note 6
60-63    user group 8
64       sync bit           fixed “0”
65       sync bit           fixed “0”
66       sync bit           fixed “1”
67       sync bit           fixed “1”
68       sync bit           fixed “1”
69       sync bit           fixed “1”
70       sync bit           fixed “1”
71       sync bit           fixed “1”
72       sync bit           fixed “1”
73       sync bit           fixed “1”
74       sync bit           fixed “1”
75       sync bit           fixed “1”
76       sync bit           fixed “1”
77       sync bit           fixed “1”
78       sync bit           fixed “0”
79       sync bit           fixed “1”

Notes:

1. Drop frame flag. 525-line and 1125-line systems: “1” if frame numbers are being dropped, “0” if no frame dropping is done. 625-line systems: “0.”

2. Color frame flag. 525-line systems: “1” if even units of frame numbers identify fields 1 and 2 and odd units of field numbers identify fields 3 and 4. 625-line systems: “1” if timecode is locked to the video signal in accordance with 8-field sequence and the video signal has the “preferred subcarrier-to-line-sync phase.” 1125-line systems: “0.”

3. 525-line and 1125-line systems: Phase correction. This bit shall be put in a state so that every 80-bit word contains an even number of “0”s. 625-line systems: Binary group flag 0.

4. 525-line and 1125-line systems: Binary group flag 0. 625-line systems: Binary group flag 2.

5. Binary group flag 1.

6. 525-line and 1125-line systems: Binary group flag 2. 625-line systems: Phase correction. This bit shall be put in a state so that every 80-bit word contains an even number of “0”s.

Table 8.20. LTC Bit Assignments.

Frames (count 0-29 for 525-line and 1125-line systems, 0-24 for 625-line systems)
  units of frames (bits 0-3)       4-bit BCD (count 0-9); bit 0 is LSB
  tens of frames (bits 8-9)        2-bit BCD (count 0-2); bit 8 is LSB

Seconds
  units of seconds (bits 16-19)    4-bit BCD (count 0-9); bit 16 is LSB
  tens of seconds (bits 24-26)     3-bit BCD (count 0-5); bit 24 is LSB

Minutes
  units of minutes (bits 32-35)    4-bit BCD (count 0-9); bit 32 is LSB
  tens of minutes (bits 40-42)     3-bit BCD (count 0-5); bit 40 is LSB

Hours
  units of hours (bits 48-51)      4-bit BCD (count 0-9); bit 48 is LSB
  tens of hours (bits 56-57)       2-bit BCD (count 0-2); bit 56 is LSB

Table 8.21. LTC Bit Arrangement.

Figure 8.49. LTC Data Bit Transition Format.
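
The bit positions in Tables 8.20 and 8.21 can be exercised with a small sketch that packs a time value into an 80-bit LTC word and evaluates the bit-rate for a given frame rate. The helper names are hypothetical. As a simplification, all flag and user group bits are left at 0, so the phase correction bit is not computed here.

```python
# Fixed sync bits 64-79, in transmission order.
SYNC_BITS = [0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1]


def put_bcd(word, start, width, value):
    """Write `value` into `width` bits beginning at `start`, LSB first."""
    for i in range(width):
        word[start + i] = (value >> i) & 1


def ltc_word(hours, minutes, seconds, frames):
    """Build the 80-bit LTC word as a list of bits, index 0 sent first."""
    word = [0] * 80   # flags and user groups stay 0 in this sketch
    put_bcd(word, 0, 4, frames % 10)
    put_bcd(word, 8, 2, frames // 10)
    put_bcd(word, 16, 4, seconds % 10)
    put_bcd(word, 24, 3, seconds // 10)
    put_bcd(word, 32, 4, minutes % 10)
    put_bcd(word, 40, 3, minutes // 10)
    put_bcd(word, 48, 4, hours % 10)
    put_bcd(word, 56, 2, hours // 10)
    word[64:80] = SYNC_BITS
    return word


def ltc_bit_rate(frame_rate):
    """Bits per second: 80 bits are sent in every frame time."""
    return 80 * frame_rate


if __name__ == "__main__":
    print(ltc_word(12, 34, 56, 7)[:10])
    print(ltc_bit_rate(25), ltc_bit_rate(30 / 1.001))
```

A complete encoder would also set the drop frame, color frame, and binary group flags, and then choose the phase correction bit so the word holds an even number of zeros.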

Vertical Interval Timecode (VITC) 

The VITC is recorded during the vertical 
blanking interval of the video signal in both 
fields. Since it is recorded with the video, it can 
be read in still mode. However, it cannot be re- 
recorded (or restriped). Restriping requires 
dubbing down a generation, deleting, and 
inserting a new timecode. For YPbPr and S- 
video interfaces, VITC is present on the Y sig- 
nal. For analog RGB interfaces, VITC is 
present on all three signals. 

As with the LTC, the basic structure of the 
time data is based on the BCD system. Tables 
8.22 and 8.23 list the VITC bit assignments and 
arrangement. Note that the 24-hour clock sys- 
tem is used. 

VITC Cyclic Redundancy Check 

Eight bits (82-89) are reserved for the code word for error detection by means of cyclic redundancy checking. The generating polynomial, $x^8 + 1$, applies to all bits from 0 to 81, inclusive. Figure 8.50 illustrates implementing the polynomial using a shift register. During passage of timecode data, the multiplexer is in position 0 and the data is output while the CRC calculation is done simultaneously by the shift register. After all the timecode data has been output, the shift register contains the CRC value, and switching the multiplexer to position 1 enables the CRC value to be output. Repeating the process on decoding, the shift register contains all zeros if no errors exist.
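
A software model of the shift register makes the zero-residue property easy to check. This is a generic sketch of division by $x^8 + 1$; the register orientation and the order in which the CRC bits are sent are assumptions, since Figure 8.50 is not reproduced here, and the function names are hypothetical.

```python
def vitc_crc(bits):
    """Run bits through an 8-stage register with feedback for x^8 + 1."""
    reg = 0
    for bit in bits:
        feedback = bit ^ (reg >> 7)   # incoming bit XOR the bit leaving stage 8
        reg = (reg << 1) & 0xFF
        reg |= feedback               # x^8 + 1 feeds back into the first stage only
    return reg


def crc_to_bits(value):
    """Serialize the register contents, last stage first (assumed order)."""
    return [(value >> (7 - i)) & 1 for i in range(8)]


if __name__ == "__main__":
    data = [1, 0] + [0, 1, 1, 0] * 20        # 82 arbitrary data bits (0-81)
    word = data + crc_to_bits(vitc_crc(data))
    print(len(word), vitc_crc(word))          # residue over all 90 bits
    word[5] ^= 1                              # inject a single-bit error
    print(vitc_crc(word))
```

With this polynomial each CRC bit is simply the parity of every eighth data bit, so any single-bit error leaves a non-zero residue.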

VITC Timing 

The modulation technique is such that 
each state corresponds to a binary state, and a 
transition occurs only when there is a change 
in the data between adjacent bits from a “1” to 
“0” or “0” to “1.” No transitions occur when 
adjacent bits contain the same data. This is 
commonly referred to as “non-return to zero” 
(NRZ). Synchronization bit pairs are inserted 
throughout the VITC data to assist the receiver 
in maintaining the correct frequency lock. 

The bit-rate (Fc) is defined to be: 

Fc = 115 × Fh ± 2% 

where Fh is the horizontal line frequency. The 
90 bits of timecode information are output seri- 
ally, with bit 0 being first. For 625i (576i) sys- 
tems, lines 19 and 332 (or 21 and 334) are 
commonly used for the VITC. For 525i (480i) 
systems, lines 14 and 277 are commonly used. 
For 1125i (1080i) systems, lines 9 and 571 are 
commonly used. To protect the VITC against 
drop-outs, it may also be present two scan lines 
later, although any two nonconsecutive scan 
lines per field may be used. 
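
As a quick worked example of the bit-rate relationship, the sketch below evaluates 115 times the line frequency and the ±2% limits. The line frequencies in the dictionary are nominal values supplied for illustration; they are assumptions, not figures taken from this section.

```python
# Nominal horizontal line frequencies in Hz (assumed for illustration).
LINE_FREQUENCY_HZ = {
    "525/59.94": 525 * 30000 / 1001,
    "625/50": 625 * 25.0,
    "1125/59.94": 1125 * 30000 / 1001,
}


def vitc_bit_rate(fh_hz, tolerance=0.02):
    """Return (low, nominal, high) VITC bit rates in bits per second."""
    nominal = 115 * fh_hz
    return nominal * (1 - tolerance), nominal, nominal * (1 + tolerance)


if __name__ == "__main__":
    for system, fh in LINE_FREQUENCY_HZ.items():
        low, nominal, high = vitc_bit_rate(fh)
        print(f"{system}: {nominal / 1e6:.4f} Mbps "
              f"({low / 1e6:.4f} to {high / 1e6:.4f})")
```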

Figure 8.51 illustrates the timing of the 
VITC data on the scan line. The data must be 
evenly spaced throughout the VITC word. The 
10% to 90% rise and fall times of the VITC bit 
data should be 200 ±50 ns (525-line and 625- 
line systems) or 100 ±25 ns (1125-line systems) 
before adding it to the video signal to avoid 
possible distortion of the VITC signal by down- 
stream chrominance circuits. In most circum- 
stances, the analog lowpass filters after the 
video D/A converters should suffice for the fil- 
tering. 
Bit(s)   Function           Note
0        sync bit           fixed “1”
1        sync bit           fixed “0”
2-5      units of frames
6-9      user group 1
10       sync bit           fixed “1”
11       sync bit           fixed “0”
12-13    tens of frames
14       flag 1             note 1
15       flag 2             note 2
16-19    user group 2
20       sync bit           fixed “1”
21       sync bit           fixed “0”
22-25    units of seconds
26-29    user group 3
30       sync bit           fixed “1”
31       sync bit           fixed “0”
32-34    tens of seconds
35       flag 3             note 3
36-39    user group 4
40       sync bit           fixed “1”
41       sync bit           fixed “0”
42-45    units of minutes
46-49    user group 5
50       sync bit           fixed “1”
51       sync bit           fixed “0”
52-54    tens of minutes
55       flag 4             note 4
56-59    user group 6
60       sync bit           fixed “1”
61       sync bit           fixed “0”
62-65    units of hours
66-69    user group 7
70       sync bit           fixed “1”
71       sync bit           fixed “0”
72-73    tens of hours
74       flag 5             note 5
75       flag 6             note 6
76-79    user group 8
80       sync bit           fixed “1”
81       sync bit           fixed “0”
82-89    CRC group

Notes: 

1. Drop frame flag. 525-line and 1125-line systems: “1” if frame numbers are being dropped, “0” if no frame 
dropping is done. 625-line systems: “0.” 

2. Color frame flag. 525-line systems: “1” if even units of frame numbers identify fields 1 and 2 and odd 
units of frame numbers identify fields 3 and 4. 625-line systems: “1” if timecode is locked to the video 
signal in accordance with the 8-field sequence and the video signal has the “preferred subcarrier-to-line-sync 
phase.” 1125-line systems: “0.” 

3. 525-line systems: Field flag. “0” during fields 1 and 3, “1” during fields 2 and 4. 625-line systems: Binary 
group flag 0. 1125-line systems: Field flag. “0” during field 1, “1” during field 2. 

4. 525-line and 1125-line systems: Binary group flag 0. 625-line systems: Binary group flag 2. 

5. Binary group flag 1. 

6. 525-line and 1125-line systems: Binary group flag 2. 625-line systems: Field flag. “0” during fields 1, 3, 5, 
and 7, “1” during fields 2, 4, 6, and 8. 



Table 8.22. VITC Bit Assignments. 
Frames (count 0-29 for 525-line and 1125-line systems, 0-24 for 625-line systems)
  units of frames (bits 2-5)       4-bit BCD (count 0-9); bit 2 is LSB
  tens of frames (bits 12-13)      2-bit BCD (count 0-2); bit 12 is LSB

Seconds
  units of seconds (bits 22-25)    4-bit BCD (count 0-9); bit 22 is LSB
  tens of seconds (bits 32-34)     3-bit BCD (count 0-5); bit 32 is LSB

Minutes
  units of minutes (bits 42-45)    4-bit BCD (count 0-9); bit 42 is LSB
  tens of minutes (bits 52-54)     3-bit BCD (count 0-5); bit 52 is LSB

Hours
  units of hours (bits 62-65)      4-bit BCD (count 0-9); bit 62 is LSB
  tens of hours (bits 72-73)       2-bit BCD (count 0-2); bit 72 is LSB


Table 8.23. VITC Bit Arrangement. 
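
The bit arrangement in Tables 8.22 and 8.23 can be exercised with a small sketch like the following. It assumes the 90-bit word is already available as a Python list indexed by bit number; the function names are hypothetical, and the CRC bits (82-89) are simply left at zero here.

```python
# (units_start, units_width, tens_start, tens_width), per Table 8.23.
TIME_FIELDS = {
    "frames": (2, 4, 12, 2),
    "seconds": (22, 4, 32, 3),
    "minutes": (42, 4, 52, 3),
    "hours": (62, 4, 72, 2),
}


def read_field(bits, start, width):
    """Read a BCD digit whose LSB is at bit number `start`."""
    return sum(bits[start + i] << i for i in range(width))


def write_field(bits, start, width, value):
    for i in range(width):
        bits[start + i] = (value >> i) & 1


def decode_vitc_time(bits):
    out = {}
    for name, (u_start, u_width, t_start, t_width) in TIME_FIELDS.items():
        tens = read_field(bits, t_start, t_width)
        units = read_field(bits, u_start, u_width)
        out[name] = 10 * tens + units
    return out


def encode_vitc_time(hours, minutes, seconds, frames):
    bits = [0] * 90
    for first in range(0, 90, 10):
        bits[first] = 1  # each sync pair is a fixed "1" followed by a fixed "0"
    values = {"hours": hours, "minutes": minutes,
              "seconds": seconds, "frames": frames}
    for name, (u_start, u_width, t_start, t_width) in TIME_FIELDS.items():
        write_field(bits, u_start, u_width, values[name] % 10)
        write_field(bits, t_start, t_width, values[name] // 10)
    return bits  # user groups, flags, and CRC are not filled in by this sketch


if __name__ == "__main__":
    print(decode_vitc_time(encode_vitc_time(12, 34, 56, 21)))
```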



Figure 8.50. VITC CRC Generation. 
[Figure not reproduced; its annotated values are listed below.]

System        Line period    VITC word     Before VITC word      After VITC     Amplitude
              (115 bits)     (90 bits)     (min.)                word (min.)
525/59.94     63.556 µs      50.286 µs     10 µs (19 bits)       2.1 µs         80 ±10 IRE
625/50        64 µs          49.655 µs     11.2 µs (21 bits)     1.9 µs         78 ±7 IRE
1125/59.94    29.63 µs       23.18 µs      2.7 µs (10.5 bits)    1.5 µs         78 ±7 IRE

Figure 8.51. VITC Position and Timing. 
User Bits Content               Timecode Referenced    BGF2   BGF1   BGF0
                                to External Clock
user defined                    no                     0      0      0
8-bit character set (note 1)    no                     0      0      1
user defined                    yes                    0      1      0
reserved                        unassigned             0      1      1
date and time zone (note 3)     no                     1      0      0
page / line (note 2)            no                     1      0      1
date and time zone (note 3)     yes                    1      1      0
page / line (note 2)            yes                    1      1      1


Notes : 

1. Conforming to ISO/IEC 646 or 2022. 

2. Described in SMPTE 262M. 

3. Described in SMPTE 309M. See Tables 8.25 through 8.27. 



Table 8.24. LTC and VITC Binary Group Flag (BGF) Bit Definitions. 
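
A decoder might look up the binary group flags roughly as follows. This is a sketch only: the bit positions are taken from the notes to Table 8.22 (flags 3 through 6), the meanings from Table 8.24, and the names are hypothetical.

```python
# VITC bit numbers of (BGF2, BGF1, BGF0), per the notes to Table 8.22.
BGF_BITS = {
    "525": (75, 74, 55),
    "1125": (75, 74, 55),
    "625": (55, 74, 35),
}

# (BGF2, BGF1, BGF0) -> (user bits content, timecode referenced to external clock)
BGF_MEANING = {
    (0, 0, 0): ("user defined", False),
    (0, 0, 1): ("8-bit character set", False),
    (0, 1, 0): ("user defined", True),
    (0, 1, 1): ("reserved", None),  # clock reference is unassigned
    (1, 0, 0): ("date and time zone", False),
    (1, 0, 1): ("page / line", False),
    (1, 1, 0): ("date and time zone", True),
    (1, 1, 1): ("page / line", True),
}


def decode_bgf(bits, system):
    """Return (content, external_clock) for a 90-bit VITC word."""
    key = tuple(bits[n] for n in BGF_BITS[system])
    return BGF_MEANING[key]


if __name__ == "__main__":
    word = [0] * 90
    word[75] = 1
    print(decode_bgf(word, "525"), decode_bgf(word, "625"))
```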



[Figure not reproduced. Each pair of user groups (1 and 2, 
3 and 4, 5 and 6, 7 and 8) carries one ISO character. 
7-bit ISO: B1 B2 B3 B4 B5 B6 B7 0. 
8-bit ISO: A1 A2 A3 A4 A5 A6 A7 A8.] 

Figure 8.52. Use of Binary Groups to Describe 
ISO Characters Coded with 7 or 8 Bits. 
            User Group 8                       User Group 7
Bit 3       Bit 2    Bit 1    Bit 0    Bit 3    Bit 2    Bit 1    Bit 0
MJD flag    0        time zone offset code 0x00-0x3F


Notes: 

1. MJD flag: “0” = YYMMDD format, “1” = MJD format. 



Table 8.25. Date and Time Zone Format Coding. 



User Group    Assignment    Value    Description
1             D             0-9      day units
2             D             0-3      day tens
3             M             0-9      month units
4             M             0, 1     month tens
5             Y             0-9      year units
6             Y             0-9      year tens


Table 8.26. YYMMDD Date Format. 



User Bits 

The binary group flag (BGF) bits shown in 
Table 8.24 specify the content of the 32 user 
bits. The 32 user bits are organized as eight 
groups of four bits each. 

The user bits are intended for storage of 
data by users. The 32 bits may be assigned in 
any manner without restriction, if indicated as 
user-defined by the binary group flags. 

If an 8-bit character set conforming to 
ISO/IEC 646 or 2022 is indicated by the binary 
group flags, the characters are to be inserted 
as shown in Figure 8.52. Note that some user 
bits will be decoded before the binary group 
flags are decoded; therefore, the decoder must 
store the early user data before any processing 
is done. 



When the user groups are used to transfer 
time zone and date information, user groups 7 
and 8 specify the time zone and the format of 
the date in the remaining six user groups, as 
shown in Tables 8.25 and 8.27. The date may 
be either a six-digit YYMMDD format (Table 
8.26) or a six-digit modified Julian date (MJD), 
as indicated by the MJD flag. 
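
The date and time zone layout can be illustrated with the sketch below. It takes the eight user groups as already-extracted 4-bit values, and it assumes that the 6-bit time zone code occupies the two low bits of user group 8 followed by all four bits of user group 7, and that the second digit of each date pair is the tens digit. The digit order of an MJD date is not covered, so those digits are returned untouched. The function name is hypothetical.

```python
def decode_date_groups(groups):
    """groups[0] is user group 1, ..., groups[7] is user group 8."""
    if len(groups) != 8 or any(not 0 <= g <= 0xF for g in groups):
        raise ValueError("expected eight 4-bit user group values")

    mjd_flag = bool(groups[7] & 0x8)  # bit 3 of user group 8
    # Assumed layout: low two bits of group 8 are the top of the 6-bit code.
    time_zone_code = ((groups[7] & 0x3) << 4) | groups[6]

    if mjd_flag:
        return {"format": "MJD", "digits": list(groups[:6]),
                "time_zone_code": time_zone_code}
    return {
        "format": "YYMMDD",
        "day": 10 * groups[1] + groups[0],
        "month": 10 * groups[3] + groups[2],
        "year": 10 * groups[5] + groups[4],
        "time_zone_code": time_zone_code,
    }


if __name__ == "__main__":
    # 25 December, year 99, time zone code 0x05 (UTC - 05.00 in Table 8.27)
    print(decode_date_groups([5, 2, 2, 1, 9, 9, 0x5, 0x0]))
```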

CEA-608 Closed Captioning 

This section reviews CEA-608 closed cap- 
tioning for the hearing impaired in the United 
States. Closed captioning and text are transmit- 
ted during the blanked active line-time portion 
of lines 21 and 284. However, due to video edit- 
ing they may occasionally reside on any line 
between 21-25 and 284-289. 
Code   Hours          Code   Hours          Code   Hours
00     UTC            16     UTC + 10.00    2C     UTC + 09.30
01     UTC - 01.00    17     UTC + 09.00    2D     UTC + 08.30
02     UTC - 02.00    18     UTC + 08.00    2E     UTC + 07.30
03     UTC - 03.00    19     UTC + 07.00    2F     UTC + 06.30
04     UTC - 04.00    1A     UTC - 06.30    30     TP-1
05     UTC - 05.00    1B     UTC - 07.30    31     TP-0
06     UTC - 06.00    1C     UTC - 08.30    32     UTC + 12.45
07     UTC - 07.00    1D     UTC - 09.30    33     reserved
08     UTC - 08.00    1E     UTC - 10.30    34     reserved
09     UTC - 09.00    1F     UTC - 11.30    35     reserved
0A     UTC - 00.30    20     UTC + 06.00    36     reserved
0B     UTC - 01.30    21     UTC + 05.00    37     reserved
0C     UTC - 02.30    22     UTC + 04.00    38     user defined
0D     UTC - 03.30    23     UTC + 03.00    39     unknown
0E     UTC - 04.30    24     UTC + 02.00    3A     UTC + 05.30
0F     UTC - 05.30    25     UTC + 01.00    3B     UTC + 04.30
10     UTC - 10.00    26     reserved       3C     UTC + 03.30
11     UTC - 11.00    27     reserved       3D     UTC + 02.30
12     UTC - 12.00    28     TP-3           3E     UTC + 01.30
13     UTC + 13.00    29     TP-2           3F     UTC + 00.30
14     UTC + 12.00    2A     UTC + 11.30
15     UTC + 11.00    2B     UTC + 10.30

Table 8.27. Time Zone Offset Codes. 
Extended data service (XDS) packets also 
may be transmitted during the blanked active 
line-time portion of line 284. XDS packets may 
indicate the program name, time into the show, 
time remaining to the end, and so on. 

Note that due to editing before transmis- 
sion, it may be possible that the caption infor- 
mation is occasionally moved down a scan line 
or two. Therefore, caption decoders should 
monitor more than just lines 21 and 284 for 
caption information. 

Waveform 

The data format for both lines consists of a 
clock run-in signal, a start bit, and two 7-bit 
plus parity words of ASCII data (per X3.4- 
1967). For YPbPr and S-video interfaces, cap- 



tioning is present on the Y signal. For analog 
RGB interfaces, captioning is present on the 
green channel, at an amplitude 1.7x of that 
used for composite or Y. 

Figure 8.53 illustrates the waveform and 
timing for transmitting the closed captioning 
and XDS information and conforms to CEA- 
608. The clock run-in is a 7-cycle sinusoidal 
burst that is frequency-locked and phase- 
locked to the caption data and is used to pro- 
vide synchronization for the decoder. The 
nominal data rate is 32 × FH. However, decoders 
should not rely on this timing relationship 
due to possible horizontal timing variations 
introduced by video processing circuitry and 
VCRs. After the clock run-in signal, the blank- 
ing level is maintained for a two data bit dura- 
tion, followed by a “1” start bit. The start bit is 



[Figure not reproduced. Waveform labels: sync level; blank 
level; 3.58 MHz color burst (9 cycles), 40 IRE; clock 
run-in, 7 cycles of 0.5035 MHz; data, two 7-bit + parity 
ASCII characters (D0-D6 plus a parity bit each), 50 ±2 IRE; 
rise/fall times of 240-288 ns (2T bar shaping). Timing 
annotations: 10.5 ±0.25 µs, 12.91 µs, 10.003 ±0.25 µs, 
27.382 µs, and 33.764 µs.] 

Figure 8.53. 525-Line Lines 21 and 284 Closed Captioning Timing. 
followed by 16 bits of data, composed of two 
7-bit + odd parity ASCII characters. Caption data 
is transmitted using a non-return-to-zero 
(NRZ) code; a “1” corresponds to the 50 ±2 IRE 
level and a “0” corresponds to the blanking 
level (0-2 IRE). The negative-going crossings 
of the clock are coherent with the data bit 
transitions. 

Typical decoders specify the time between 
the 50% points of sync and clock run-in to be 
10.5 ±0.5 µs, with a ±3% tolerance on FH, 50 ±12 
IRE for a “1” bit, and -2 to +12 IRE for a “0” bit. 
Decoders must also handle bit rise/fall times 
of 240-480 ns. 

NUL characters (0x00) should be sent 
when no display or control characters are 
being transmitted. This, in combination with 
the clock run-in, enables the decoder to deter- 
mine whether captioning or text transmission 
is being implemented. 
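
Since every caption byte is a 7-bit character plus an odd parity bit, a decoder normally validates parity before interpreting a byte. The sketch below assumes the parity bit is stored as the most significant bit of each received byte (it is transmitted after the seven LSB-first data bits); the helper names are hypothetical.

```python
def has_odd_parity(byte):
    """True if the 8-bit value contains an odd number of 1 bits."""
    return bin(byte & 0xFF).count("1") % 2 == 1


def add_odd_parity(char7):
    """Attach an odd parity bit (as bit 7) to a 7-bit character."""
    char7 &= 0x7F
    return char7 if has_odd_parity(char7) else char7 | 0x80


def strip_parity(byte):
    """Return the 7-bit character, or None if the parity check fails."""
    return byte & 0x7F if has_odd_parity(byte) else None


if __name__ == "__main__":
    # A NUL character (0x00) goes out as 0x80 once odd parity is added.
    print(hex(add_odd_parity(0x00)), hex(add_odd_parity(ord("A"))))
```

How a decoder treats a byte that fails the check (for example, dropping it or substituting a placeholder) is a design decision left out of this sketch.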

If using only line 21, the clock run-in and 
data do not need to be present on line 284. 
However, if using only line 284, the clock run- 
in and data should be present on both lines 21 
and 284; data for line 21 would consist of NUL 
characters. 

At the decoder, as shown in Figure 8.54, 
the display area of a 525-line 4:3 interlaced dis- 
play is typically 15 rows high and 34 columns 
wide. The vertical display area begins on lines 
43 and 306 and ends on lines 237 and 500. The 
horizontal display area begins 13 µs, and ends 
58 µs, after the leading edge of horizontal sync. 

In text mode, all rows are used to display 
text; each row contains a maximum of 32 char- 
acters, with at least a one-column wide space 
on the left and right of the text. The only trans- 
parent area is around the outside of the text 
area. 

In caption mode, text usually appears only 
on rows 1-4 or 12-15; the remaining rows are 
usually transparent. Each row contains a maxi- 



mum of 32 characters, with at least a one-col- 
umn wide space on the left and right of the 
text. 

Some caption decoders support up to 48 
columns per row, and up to 16 rows, allowing 
some customization for the display of caption 
data. 

Basic Services 

There are two types of basic services: text 
mode (a data service generally not program 
related) and captioning. 

In understanding the operation of the 
decoder, it is easier to visualize an invisible cur- 
sor that marks the position where the next 
character will be displayed. Note that if you are 
designing a decoder, you should obtain the lat- 
est CEA-608 specification to ensure correct 
operation, as this section is only a summary. 

Text Mode 

Text mode, based on real-time scrolling, 
uses 7-15 rows of the display and is enabled 
upon receipt of the Resume Text Display or 
Text Restart code. When text mode has been 
selected, and the text memory is empty, the 
cursor starts at the top-most row, character 1 
position. Once all the rows of text are dis- 
played, scrolling is enabled. 

With each carriage return received, the 
top-most row of text is erased, the remaining 
text is smoothly rolled up one row (using 6-13 
uniform steps over 12-26 fields), the bottom 
row is erased, and the cursor is moved to the 
bottom row, character 1 position. If new text is 
received while scrolling, it is seen scrolling up 
from the bottom of the display area. If a car- 
riage return is received while scrolling, the 
rows are immediately moved up one row to 
their final position. 

Once the cursor moves to the character 32 
position on any row, any text received before a 
carriage return, preamble address code, or 
backspace will be displayed at the character 32 
position, replacing any previous character at 
that position. The Text Restart command 
erases all characters on the display and moves 
the cursor to the top row, character 1 position. 

Additional real-time display methods can 
be optionally implemented by the decoder and 
used under viewer control. 

Captioning Mode 

Captioning has several modes available, 
including roll-up, pop-on, and paint-on. 

Roll-up captioning is enabled by receiving 
one of the miscellaneous control codes to 
select the number of rows displayed. “Roll-up 
captions, 2 rows” enables rows 14 and 15; “roll- 
up captions, 3 rows” enables rows 13-15; “roll- 



up captions, 4 rows” enables rows 12-15. 
Regardless of the number of rows enabled, the 
cursor remains on row 15. Once row 15 is full, 
the rows are scrolled up one row (at the rate of 
one dot per frame), and the cursor is moved 
back to row 15, character 1. 

Pop-on captioning may use rows 1-4 or 12- 
15, and is initiated by the Resume Caption 
Loading command. The display memory is 
essentially double-buffered. While memory 
buffer 1 is displayed, memory buffer 2 is being 
loaded with caption data. At the receipt of an 
End of Caption code, memory buffer 2 is dis- 
played while memory buffer 1 is being loaded 
with new caption data. 
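
The double-buffered behavior of pop-on captioning can be modeled as two memories that trade places on End of Caption. The class below is a toy model with hypothetical names; it stores whole strings rather than a character grid, and it only erases the nondisplayed memory when explicitly asked to.

```python
class PopOnCaptionMemory:
    """Toy model of the displayed and nondisplayed caption memories."""

    def __init__(self):
        self.displayed = []
        self.nondisplayed = []

    def load(self, text):
        # Caption data is loaded into the memory that is not on screen.
        self.nondisplayed.append(text)

    def end_of_caption(self):
        # "Flip memories": the loaded caption pops on in one step.
        self.displayed, self.nondisplayed = self.nondisplayed, self.displayed

    def erase_nondisplayed(self):
        self.nondisplayed = []


if __name__ == "__main__":
    mem = PopOnCaptionMemory()
    mem.load("HELLO")
    mem.end_of_caption()
    print(mem.displayed)
```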

Paint-on captioning, enabled by the 
Resume Direct Captioning command, is similar 
to pop-on captioning, but no double-buffering 
is used; caption data is loaded directly into 
display memory. 

[Figure not reproduced. It labels the upper rows of the 
display “captions or infotext,” the middle rows 
“infotext only,” and the lower rows “captions or 
infotext.”] 

Figure 8.54. Closed Captioning Display Format. 

Three types of control codes (preamble 
address codes, midrow codes, and miscella- 
neous control codes) are used to specify the 
format, location, and attributes of the charac- 
ters. Each control code consists of two bytes, 
transmitted together on line 21 or line 284. On 
line 21, they are normally transmitted twice in 
succession to help ensure correct reception. 
They are not transmitted twice on line 284 to 
minimize bandwidth used for captioning. 

The first byte is a nondisplay control byte 
with a range of 0x10 to 0x1F; the second byte is 
a display control byte in the range of 0x20 to 
0x7F. At the beginning of each row, a control 
code is sent to initialize the row. Caption roll-up 
and text modes allow either a preamble 
address code or midrow control code at the 
start of a row; the other caption modes use a 
preamble address code to initialize a row. The 
preamble address codes are illustrated in Fig- 
ure 8.55 and Table 8.28. 
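
The byte ranges above are enough to tell a control pair from a pair of displayable characters, and to discard the repeat that is normally sent on line 21. The following is a minimal sketch with hypothetical function names; it assumes parity has already been checked and stripped, and it drops only a single immediate repeat of a control pair.

```python
def is_control_pair(b1, b2):
    """True if (b1, b2) is a two-byte control code (parity already removed)."""
    return 0x10 <= b1 <= 0x1F and 0x20 <= b2 <= 0x7F


def drop_repeated_controls(pairs):
    """Remove the second copy of each control code that was sent twice."""
    out = []
    previous = None
    for pair in pairs:
        if is_control_pair(*pair) and pair == previous:
            previous = None  # swallow exactly one repeat
            continue
        out.append(pair)
        previous = pair if is_control_pair(*pair) else None
    return out


if __name__ == "__main__":
    stream = [(0x14, 0x2F), (0x14, 0x2F), (0x48, 0x49)]
    print(drop_repeated_controls(stream))
```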



The midrow codes are typically used 
within a row to change the color, italics, under- 
line, and flashing attributes and should occur 
only between words. Color, italics, and under- 
line are controlled by the preamble address 
and midrow codes; flash on is controlled by a 
miscellaneous control code. An attribute 
remains in effect until another control code is 
received or the end of row is reached. Each 
row starts with a control code to set the color 
and underline attributes (white nonunderlined 
is the default if no control code is received 
before the first character on an empty row). 
The color attribute can be changed only by the 
midrow code of another color; the italics 
attribute does not change the color attribute. 
However, a color attribute turns off the italics 
attribute. The flash on command does not alter 
the status of the color, italics, or underline 
attributes. However, a color or italics midrow 
control code turns off the flash. Note that the 
underline color is the same color as the character 
being underlined; the underline resides on 
dot row 11 and covers the entire width of the 
character column. 

[Figure not reproduced. A preamble control code 
(transmitted twice) consists of a start bit, a 
non-display control character (7 bits, LSB first), an 
odd parity bit, a display control character (7 bits, 
LSB first), and an odd parity bit; together these carry 
the identification code, row position, indent, and 
display condition instructions. The caption text that 
follows (up to 32 characters per row) is sent as a start 
bit, the first text character (7 bits, LSB first), an 
odd parity bit, the second text character (7 bits, LSB 
first), and an odd parity bit, marking the beginning of 
the displayed caption.] 

Figure 8.55. Closed Captioning Preamble Address Code Format. 

A    B    C    D    Attribute
0    0    0    0    white
0    0    0    1    green
0    0    1    0    blue
0    0    1    1    cyan
0    1    0    0    red
0    1    0    1    yellow
0    1    1    0    magenta
0    1    1    1    white italics
1    0    0    0    indent 0, white
1    0    0    1    indent 4, white
1    0    1    0    indent 8, white
1    0    1    1    indent 12, white
1    1    0    0    indent 16, white
1    1    0    1    indent 20, white
1    1    1    0    indent 24, white
1    1    1    1    indent 28, white

Notes: 

1. U: “0” = no underline, “1” = underline. 

2. CH: “0” = data channel 1, “1” = data channel 2. 

Table 8.28. Closed Captioning Preamble Address Codes. In text mode, the indent codes 
may be used to perform indentation; in this instance, the row information is ignored. 

Table 8.29, Figure 8.56, and Table 8.30 
illustrate the midrow and miscellaneous con- 
trol code operation. For example, if it were the 
end of a caption, the control code could be End 
of Caption (transmitted twice). It could be 
followed by a preamble address code (transmitted 
twice) to start another line of captioning. 

Characters are displayed using a dot 
matrix format. Each character cell is typically 
16 samples wide and 26 samples high (16 x 
26), as shown in Figure 8.57. Dot rows 2-19 
are usually used for actual character outlines. 
Dot rows 0, 1, 20, 21, 24, and 25 are usually 
blanked to provide vertical spacing between 
characters, and underlining is typically done 
on dot rows 22 and 23. Dot columns 0, 1, 14 
and 15 are blanked to provide horizontal spac- 
ing between characters, except on dot rows 22 
and 23 when the underline is displayed. This 



results in 12 x 18 characters stored in charac- 
ter ROM. Table 8.31 shows the basic character 
set. 

Some caption decoders support multiple 
character sizes within the 16 x 26 region, 
including 13 x 16, 13 x 24, 12 x 20, and 12 x 26. 
Not all combinations generate a sensible result 
due to the limited display area available. 

Optional Captioning Features 

Three sets of optional features are avail- 
able for advanced captioning decoders. 

Optional Attributes 

Additional color choices are available for 
advanced captioning decoders, as shown in 
Table 8.32. 

If a decoder doesn’t support semitranspar- 
ent colors, the opaque colors may be used 
instead. If a specific background color isn’t 
supported by a decoder, it should default to the 



Non-display control byte (D6-D0):   0  0  1  CH  0  0  1
Display control byte (D6-D0):       0  1  0  D3  D2  D1  U

D3   D2   D1   Attribute
0    0    0    white
0    0    1    green
0    1    0    blue
0    1    1    cyan
1    0    0    red
1    0    1    yellow
1    1    0    magenta
1    1    1    italics


Notes : 

1. U: “0” = no underline, “1” = underline. 

2. CH: “0” = data channel 1, “1” = data channel 2. 

3. Italics is implemented as a two-dot slant to the right over the vertical range of the character. Some 
decoders implement a one-dot slant for every four scan lines. Underline resides on dot rows 22 and 
23, and covers the entire column width. 



Table 8.29. Closed Captioning Midrow Codes. 
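
Reading Table 8.29 as a non-display byte of the form 0 0 1 CH 0 0 1 and a display byte of the form 0 1 0 D3 D2 D1 U, a midrow code could be decoded along the lines below. The bit layout is inferred from the table, the function name is hypothetical, and parity is assumed to have been stripped already.

```python
MIDROW_ATTRIBUTES = [
    "white", "green", "blue", "cyan",
    "red", "yellow", "magenta", "italics",
]


def decode_midrow(b1, b2):
    """Decode a midrow code; return None if the pair is not a midrow code."""
    # Ignore the CH bit (bit 3) when matching the non-display byte.
    if (b1 & 0x77) != 0x11 or (b2 & 0x70) != 0x20:
        return None
    return {
        "channel": 2 if b1 & 0x08 else 1,
        "attribute": MIDROW_ATTRIBUTES[(b2 >> 1) & 0x07],
        "underline": bool(b2 & 0x01),
    }


if __name__ == "__main__":
    print(decode_midrow(0x11, 0x29))
```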
[Figure not reproduced. Text is sent as a start bit, a 
text character (7 bits, LSB first), an odd parity bit, a 
second text character (7 bits, LSB first), and an odd 
parity bit. A midrow control code (transmitted twice) is 
sent as a start bit, a non-display control character 
(7 bits, LSB first), an odd parity bit, a display control 
character (7 bits, LSB first), and an odd parity bit.] 

Figure 8.56. Closed Captioning Midrow Code Format. Miscellaneous control codes may also be 
transmitted in place of the midrow control code. 



Non-display control byte (D6-D0):   0  0  1  CH  1  0  F
Display control byte (D6-D0):       0  1  0  D3  D2  D1  D0

D3   D2   D1   D0   Command
0    0    0    0    resume caption loading
0    0    0    1    backspace
0    0    1    0    reserved
0    0    1    1    reserved
0    1    0    0    delete to end of row
0    1    0    1    roll-up captions, 2 rows
0    1    1    0    roll-up captions, 3 rows
0    1    1    1    roll-up captions, 4 rows
1    0    0    0    flash on
1    0    0    1    resume direct captioning
1    0    1    0    text restart
1    0    1    1    resume text display
1    1    0    0    erase displayed memory
1    1    0    1    carriage return
1    1    1    0    erase nondisplayed memory
1    1    1    1    end of caption (flip memories)

Non-display control byte (D6-D0):   0  0  1  CH  1  1  1
Display control byte (D6-D0):       0  1  0  D3  D2  D1  D0

D3   D2   D1   D0   Command
0    0    0    1    tab offset (1 column)
0    0    1    0    tab offset (2 columns)
0    0    1    1    tab offset (3 columns)


Notes : 

1. F: “0” = line 21, “1” = line 284. CH: “0” = data channel 1, “1” = data channel 2. 

2. “Flash on” blanks associated characters for 0.25 seconds once per second. 



Table 8.30. Closed Captioning Miscellaneous Control Codes. 
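
The miscellaneous control codes lend themselves to a table lookup. The sketch below reads Table 8.30 as a non-display byte of the form 0 0 1 CH 1 0 F with a display byte of 0 1 0 D3 D2 D1 D0, plus a second group (0 0 1 CH 1 1 1) for the tab offsets. That reading of the table, and the function name, are assumptions of this example; parity is assumed to be stripped.

```python
MISC_COMMANDS = [
    "resume caption loading", "backspace", "reserved", "reserved",
    "delete to end of row", "roll-up captions, 2 rows",
    "roll-up captions, 3 rows", "roll-up captions, 4 rows",
    "flash on", "resume direct captioning", "text restart",
    "resume text display", "erase displayed memory", "carriage return",
    "erase nondisplayed memory", "end of caption (flip memories)",
]


def decode_misc(b1, b2):
    """Decode a miscellaneous control code, or return None."""
    channel = 2 if b1 & 0x08 else 1
    # 0 0 1 CH 1 0 F: mask out the CH (bit 3) and F (bit 0) bits.
    if (b1 & 0x76) == 0x14 and (b2 & 0x70) == 0x20:
        return {"channel": channel,
                "line": 284 if b1 & 0x01 else 21,
                "command": MISC_COMMANDS[b2 & 0x0F]}
    # 0 0 1 CH 1 1 1 followed by 0x21-0x23: tab offsets.
    if (b1 & 0x77) == 0x17 and 0x21 <= b2 <= 0x23:
        return {"channel": channel,
                "command": "tab offset",
                "columns": b2 & 0x03}
    return None


if __name__ == "__main__":
    print(decode_misc(0x14, 0x2F))
    print(decode_misc(0x1F, 0x22))
```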
[Figure not reproduced. It shows the dot matrix of one 
character cell, with dot rows numbered from 0 and mapped 
to scan lines of both fields, starting at line 43 and 
line 306. Legend: blank dot, character dot, underline.] 

Figure 8.57. Typical 16x26 Closed Captioning Character Cell Format for Row 1. 
Table 8.31. Closed Captioning Basic Character Set. 
Background attributes

Non-display control byte (D6-D0):   0  0  1  CH  0  0  0
Display control byte (D6-D0):       0  1  0  D3  D2  D1  T

D3   D2   D1   Background Attribute
0    0    0    white
0    0    1    green
0    1    0    blue
0    1    1    cyan
1    0    0    red
1    0    1    yellow
1    1    0    magenta
1    1    1    black

Non-display control byte (D6-D0):   0  0  1  CH  1  1  1
Display control byte (D6-D0):       0  1  0  1  1  0  1     transparent

Foreground attributes

Non-display control byte (D6-D0):   0  0  1  CH  1  1  1
Display control byte (D6-D0):       0  1  0  1  1  1  0     black
                                    0  1  0  1  1  1  1     black underline


Notes : 

1. T: “0” = opaque, “1” = semi-transparent. 

2. CH: “0” = data channel 1, “1” = data channel 2. 

3. Underline resides on dot rows 22 and 23, and covers the entire column width. 

Table 8.32. Closed Captioning Optional Attribute Codes. 



black background color. However, if the black 
foreground color is supported in a decoder, all 
the background colors should be imple- 
mented. 

A background attribute appears as a stan- 
dard space on the display, and the attribute 
remains in effect until the end of the row or 
until another background attribute is received. 

The foreground attributes provide an 
eighth color (black) as a character color. As 
with midrow codes, a foreground attribute 
code turns off italics and blinking, and the 
least significant bit controls underlining. 

Background and foreground attribute 
codes have an automatic backspace for back- 
ward compatibility with current decoders. 
Thus, an attribute must be preceded by a stan- 
dard space character. Standard decoders dis- 



play the space and ignore the attribute. 
Extended decoders display the space, and on 
receiving the attribute, backspace, then display 
a space that changes the color and opacity. 
Thus, text formatting remains the same 
regardless of the type of decoder. 

Optional Closed Group Extensions 

To support new features and characters 
not defined by the current standard, the CEA 
maintains a set of code assignments requested 
by various caption providers and decoder man- 
ufacturers. These code assignments (currently 
used to select various Asian character sets) are 
not compatible with caption decoders in the 
United States and videos using them should 
not be distributed in the U.S. market. 
Closed group extensions require two 
bytes. Table 8.33 lists the currently assigned 
closed group extensions to support captioning 
in the Asian languages. 

Optional Extended Characters 

An additional 64 accented characters 
(eight character sets of eight characters each) 
may be supported by decoders, permitting the 
display of other languages such as Spanish, 
French, Portuguese, German, Danish, Italian, 
Finnish, and Swedish. If supported, these 
accented characters are available in all caption 
and text modes. 

Each of the extended characters incorpo- 
rates an automatic backspace for backward 
compatibility with current decoders. Thus, an 
extended character must be preceded by the 



standard ASCII version of the character. Stan- 
dard decoders display the ASCII character and 
ignore the accented character. Extended 
decoders display the ASCII character, and on 
receiving the accented character, backspace, 
then display the accented character. Thus, text 
formatting remains the same regardless of the 
type of decoder. 

Extended characters require two bytes. 
The first byte is 0x12 or 0x13 for data channel 
one (0x1A or 0x1B for data channel two), 
followed by a value of 0x20-0x3F. 

Extended Data Services 

Line 284 may contain extended data service 
information, interleaved with the caption 
and text information, as bandwidth is available. 
In this case, control codes are not transmitted 



Non-display control byte (D6-D0):   0  0  1  CH  1  1  1
Display control byte (D6-D0):       0  1  0  D3  D2  D1  D0

D3   D2   D1   D0   Background Attribute
0    1    0    0    standard character set (normal size)
0    1    0    1    standard character set (double size)
0    1    1    0    first private character set
0    1    1    1    second private character set
1    0    0    0    People’s Republic of China character set (GB 2312)
1    0    0    1    Korean Standard character set (KSC 5601-1987)
1    0    1    0    first registered character set


Notes : 

1. CH: “0” = data channel 1, “1” = data channel 2. 



Table 8.33. Closed Captioning Optional Closed Group Extensions. 
twice, as they may be for the caption and text 
services. 

Information is transmitted as packets and 
operates as a separate unique data channel. 
Data for each packet may or may not be contig- 
uous and may be separated into subpackets 
that can be inserted anywhere space is avail- 
able in the line 284 information stream. 

There are four types of extended data char- 
acters: 

Control: Control characters are used as a mode 
switch to enable the extended data mode. 
They are the first character of two and have a 
value of 0x01 to 0x0F. 

Type: Type characters follow the control char- 
acter (thus, they are the second character of 
two) and identify the packet type. They have a 
value of 0x01 to 0x0F. 

Checksum: Checksum characters always follow 
the “end of packet” control character. Thus, 
they are the second character of two and have 
a value of 0x00 to 0x7F. 

Informational: These characters may be ASCII 
or non-ASCII data. They are transmitted in 
pairs up to and including 32 characters. A NUL 
character (0x00) is used to ensure pairs of 
characters are always sent. 
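
The pairing rule for informational characters can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `to_character_pairs` is hypothetical, and the input is assumed to be the packet's informational characters with parity already removed.

```python
def to_character_pairs(data: bytes) -> list[tuple[int, int]]:
    """Split informational characters into transmitted pairs.

    Hypothetical helper: an odd-length payload is padded with a NUL
    (0x00) so that characters are always sent two at a time.
    """
    if len(data) > 32:
        raise ValueError("at most 32 informational characters per packet")
    if len(data) % 2:
        data += b"\x00"  # NUL filler so the last character still has a partner
    return [(data[i], data[i + 1]) for i in range(0, len(data), 2)]
```

For example, the five characters of "NEWS!" would be sent as three pairs, the last one ending in NUL.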

Control Characters 

Table 8.34 lists the control codes. Current 
class describes a program currently being 
transmitted. Future programming describes a 
program to be transmitted later. It contains the 
same information and formats as the current 
class. Channel class describes non-program- 
specific information about the channel. Miscel- 
laneous describes miscellaneous information. 
Public service class transmits data or messages 
of a public service nature. Private data class is 
used in proprietary systems for whatever that 
system wishes. 
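
Because the control codes in Table 8.34 alternate between "start" and "continue" within each class, the mapping can be expressed arithmetically. The following is a minimal sketch, not decoder firmware; the function name `classify_control_code` is an assumption made for illustration.

```python
# Class names in the order they appear in Table 8.34 (two codes per class).
XDS_CLASSES = [
    "current", "future", "channel information", "miscellaneous",
    "public service", "reserved", "private data",
]

def classify_control_code(code: int) -> tuple[str, str]:
    """Return (function, class) for an extended data control code."""
    if not 0x01 <= code <= 0x0F:
        raise ValueError("control codes have a value of 0x01 to 0x0F")
    if code == 0x0F:
        return ("end", "all")
    function = "start" if code % 2 else "continue"  # odd = start, even = continue
    return (function, XDS_CLASSES[(code - 1) // 2])
```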



Type Definitions (Current Class and Future 
Programming) 

Program Identification Number (0x01) 

This packet uses four characters to specify 
a scheduled start time and date relative to 
Coordinated Universal Time (UTC). The for- 
mat is shown in Table 8.35. 

Minutes have a range of 0-59. Hours have 
a range of 0-23. Dates have a range of 1-31. 
Months have a range of 1-12. “T” indicates if a 
program is routinely tape delayed for the 
Mountain and Pacific time zones. The “D,” “L,” 
and “Z” bits are ignored by the decoder. When 
all characters are a “1,” it indicates the end of 
the current program. 
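
As an illustration of the bit layout in Table 8.35, the four characters can be unpacked with simple masks. This is a sketch under assumptions: the name `decode_program_id` is hypothetical, and each character is taken to be a 7-bit value (D6-D0), with any parity bit masked off first.

```python
def decode_program_id(chars):
    """Decode the four Program Identification Number characters."""
    if len(chars) != 4:
        raise ValueError("expected exactly four characters")
    c = [ch & 0x7F for ch in chars]  # keep D6-D0, drop parity if still present
    if all(ch == 0x7F for ch in c):
        return None  # every bit is "1": end of the current program
    minute, hour, date, month = c
    return {
        "minute": minute & 0x3F,          # m5-m0
        "hour": hour & 0x1F,              # h4-h0 (the D bit is ignored)
        "date": date & 0x1F,              # d4-d0 (the L bit is ignored)
        "month": month & 0x0F,            # m3-m0 (the Z bit is ignored)
        "tape_delayed": bool(month & 0x10),  # T bit
    }
```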

Length / Time-in-Show (0x02) 

This packet has 2, 4, or 6 characters and 
indicates the scheduled length of the program 
and elapsed time for the program. The format 
is shown in Table 8.36. 

Minutes and seconds have a range of 0-59. 
Hours have a range of 0-63. 
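
The 2-, 4-, and 6-character forms of this packet (Table 8.36) can be handled by one routine. The sketch below is illustrative only; `decode_length_time_in_show` is a hypothetical name, and times are returned as (hours, minutes[, seconds]) tuples by choice, not because any standard requires it.

```python
def decode_length_time_in_show(chars):
    """Decode a Length / Time-in-Show packet of 2, 4, or 6 characters."""
    if len(chars) not in (2, 4, 6):
        raise ValueError("packet has 2, 4, or 6 characters")
    v = [ch & 0x3F for ch in chars]  # each field uses the low six bits
    info = {"length": (v[1], v[0])}  # (hours, minutes)
    if len(v) >= 4:
        # Seconds are only present in the 6-character form, which ends
        # with a null character to keep the pair count even.
        seconds = v[4] if len(v) == 6 else None
        info["elapsed"] = (v[3], v[2], seconds)
    return info
```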

Program Name (0x03) 

This packet contains 2-32 ASCII charac- 
ters that specify the title of the program. 

Program Type (0x04) 

This packet contains 2-32 characters that 
specify the type of program. Each character is 
coded to a keyword, as shown in Table 8.37. 

Content Advisory (0x05) 

This packet, commonly associated with the
“V-chip,” contains the information
shown in Table 8.38 to indicate the program
rating.

FV indicates if fantasy violence is present. 
V indicates if violence is present. S indicates if 
sexual situations are present. L indicates if 
Control Code    Function    Class

0x01            start       current
0x02            continue    current
0x03            start       future
0x04            continue    future
0x05            start       channel information
0x06            continue    channel information
0x07            start       miscellaneous
0x08            continue    miscellaneous
0x09            start       public service
0x0A            continue    public service
0x0B            start       reserved
0x0C            continue    reserved
0x0D            start       private data
0x0E            continue    private data
0x0F            end         all



Table 8.34. CEA-608 Control Codes. 



D6   D5   D4   D3   D2   D1   D0   Character

1    m5   m4   m3   m2   m1   m0   minute
1    D    h4   h3   h2   h1   h0   hour
1    L    d4   d3   d2   d1   d0   date
1    Z    T    m3   m2   m1   m0   month



Table 8.35. CEA-608 Program Identification Number Format. 
D6   D5   D4   D3   D2   D1   D0   Character

1    m5   m4   m3   m2   m1   m0   length, minute
1    h5   h4   h3   h2   h1   h0   length, hour
1    m5   m4   m3   m2   m1   m0   elapsed time, minute
1    h5   h4   h3   h2   h1   h0   elapsed time, hour
1    s5   s4   s3   s2   s1   s0   elapsed time, second
0    0    0    0    0    0    0    null character



Table 8.36. CEA-608 Length / Time-in-Show Format. 



Code (hex)  Keyword          Code (hex)  Keyword          Code (hex)  Keyword

20          education        30          business         40          fantasy
21          entertainment    31          classical        41          farm
22          movie            32          college          42          fashion
23          news             33          combat           43          fiction
24          religious        34          comedy           44          food
25          sports           35          commentary       45          football
26          other            36          concert          46          foreign
27          action           37          consumer         47          fund raiser
28          advertisement    38          contemporary     48          game/quiz
29          animated         39          crime            49          garden
2A          anthology        3A          dance            4A          golf
2B          automobile       3B          documentary      4B          government
2C          awards           3C          drama            4C          health
2D          baseball         3D          elementary       4D          high school
2E          basketball       3E          erotica          4E          history
2F          bulletin         3F          exercise         4F          hobby

50          hockey           60          music            70          romance
51          home             61          mystery          71          science
52          horror           62          national         72          series
53          information      63          nature           73          service
54          instruction      64          police           74          shopping
55          international    65          politics         75          soap opera
56          interview        66          premiere         76          special
57          language         67          prerecorded      77          suspense
58          legal            68          product          78          talk
59          live             69          professional     79          technical
5A          local            6A          public           7A          tennis
5B          math             6B          racing           7B          travel
5C          medical          6C          reading          7C          variety
5D          meeting          6D          repair           7D          video
5E          military         6E          repeat           7E          weather
5F          miniseries       6F          review           7F          western



Table 8.37. CEA-608 Program Types. 
adult language is present. D indicates if sexually
suggestive dialog is present.

Audio Services (0x06) 

This packet contains two characters as 
shown in Table 8.39 to indicate the audio lan- 
guage and type available. 

Caption Services (0x07) 

This packet contains 2-8 characters as 
shown in Table 8.40 to indicate the program 
caption services available. L2-L0 are coded as 
shown in Table 8.39. 

Copy and Redistribution Control Packet (0x08)

This CGMS-A (Copy Generation Management
System, Analog) and Redistribution
Control Descriptor (RCD) packet contains 2
characters as shown in Table 8.41.

In the case where either B3 or B4 is a “0,”
there is no Analog Protection Service (B1 and
B2 are “0”). B0 is the analog source bit.

When RCD is a “1,” control of consumer 
redistribution has been signaled in some man- 
ner, such as the presence of the ATSC Redistri- 
bution Control Descriptor. 
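
The two characters of this packet (Table 8.41) can be decoded with shifts and masks. The following is a minimal sketch rather than a reference implementation; the function name `decode_copy_control` and the dictionary keys are assumptions chosen for illustration.

```python
CGMS_A = {
    0b00: "copying permitted without restriction",
    0b01: "no more copies",
    0b10: "one generation copy allowed",
    0b11: "no copying permitted",
}
APS = {
    0b00: "no Analog Protection Service",
    0b01: "PSP on, color striping off",
    0b10: "PSP on, 2-line color striping on",
    0b11: "PSP on, 4-line color striping on",
}

def decode_copy_control(char1: int, char2: int) -> dict:
    """Decode the Copy and Redistribution Control packet characters."""
    return {
        "cgms_a": CGMS_A[(char1 >> 3) & 0x03],  # B4-B3
        "aps": APS[(char1 >> 1) & 0x03],        # B2-B1
        "analog_source": bool(char1 & 0x01),    # B0
        "rcd": bool(char2 & 0x01),              # RCD bit of second character
    }
```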

Composite Packet-1 (0x0C)

This packet is a way of conveying several
packets as a single group. It contains the Program
Type (5 characters), Content Advisory (1
character), Length (2 characters), Time-in-Show
(2 characters), and Program Name (0-22
characters).

Composite Packet-2 (0x0D)

This packet is a way of conveying several
packets as a single group. It contains the Program
ID (4 characters), Audio Services (2
characters), Caption Services (2 characters),
Call Letters (4 characters), Native Channel (2
characters), and Network Name (0-18 characters).



Program Description Row 1 to Row 8 (0x10-0x17)

This packet contains 1-8 packet rows, with 
each packet row containing 0-32 ASCII charac- 
ters. A packet row corresponds to a line of text 
on the display. 

Each packet is used in numerical 
sequence, and if a packet contains no ASCII 
characters, a blank line will be displayed. 

Type Definitions (Channel Information Class) 

Network Name (0x01)

This packet uses 2-32 ASCII characters to 
specify the network name. 

Call Letters and Native Channel (0x02)

This packet uses four or six ASCII charac- 
ters to specify the call letters of the channel. 
When six characters are used, they reflect the 
over-the-air channel number (2-69) assigned 
by the FCC. Single-digit channel numbers are 
preceded by a zero or a null character. 

Tape Delay (0x03)

This packet uses two characters to specify 
the number of hours and minutes the local sta- 
tion typically delays network programs. The 
format of this packet is shown in Table 8.42. 

Minutes have a range of 0-59. Hours have 
a range of 0-23. This delay applies to all pro- 
grams on the channel that have the “T” bit set 
in their Program ID packet (Table 8.35).

Transmission Signal Identifier (0x04)

This packet contains four characters that 
convey the unique 16-bit Transmission Signal 
Identifier (TSID) assigned to the originating 
analog licensee. The format of this packet is 
shown in Table 8.43. 
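
Each of the four characters carries one nibble of the TSID, least significant nibble first (Table 8.43). A minimal sketch of the reassembly follows; the function name `decode_tsid` is hypothetical.

```python
def decode_tsid(chars) -> int:
    """Rebuild the 16-bit TSID from four characters (low nibble first)."""
    if len(chars) != 4:
        raise ValueError("expected exactly four characters")
    tsid = 0
    for i, ch in enumerate(chars):
        tsid |= (ch & 0x0F) << (4 * i)  # character i carries t(4i+3)..t(4i)
    return tsid
```

For example, the characters 0x44, 0x43, 0x42, 0x41 would reassemble to a TSID of 0x1234.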
D6   D5     D4   D3     D2   D1   D0

1    D/a2   a1   a0     r2   r1   r0
1    V/FV   S    L/a3   g2   g1   g0

a3-a0:
xxx0    MPA rating
LD01    U.S. TV parental guidelines
0011    Canada English language rating
0111    Canada French language rating
1011    reserved
1111    reserved

r2-r0: MPA rating            g2-g0: U.S. TV rating
000    not applicable        000    not rated
001    G                     001    TV-Y
010    PG                    010    TV-Y7
011    PG-13                 011    TV-G
100    R                     100    TV-PG
101    NC-17                 101    TV-14
110    X                     110    TV-MA
111    not rated             111    not rated

g2-g0: Canada English        g2-g0: Canada French
       language rating              language rating
000    E                     000    E
001    C                     001    G
010    C8 +                  010    8 ans +
011    G                     011    13 ans +
100    PG                    100    16 ans +
101    14 +                  101    18 ans +
110    18 +                  110    reserved
111    reserved              111    reserved



Table 8.38. CEA-608 Content Advisory Format. 



D6   D5   D4   D3   D2   D1   D0   Character

1    L2   L1   L0   T2   T1   T0   main audio program
1    L2   L1   L0   S2   S1   S0   second audio program (SAP)

L2-L0:              T2-T0:                     S2-S0:
000   unknown       000   unknown              000   unknown
001   English       001   mono                 001   mono
010   Spanish       010   simulated stereo     010   video descriptions
011   French        011   true stereo          011   non-program audio
100   German        100   stereo surround      100   special effects
101   Italian       101   data service         101   data service
110   other         110   other                110   other
111   none          111   none                 111   none



Table 8.39. CEA-608 Audio Services Format. 
D6   D5   D4   D3   D2   D1   D0   Character

1    L2   L1   L0   F    C    T    service code

FCT:
000   line 21, data channel 1 captioning
001   line 21, data channel 1 text
010   line 21, data channel 2 captioning
011   line 21, data channel 2 text
100   line 284, data channel 1 captioning
101   line 284, data channel 1 text
110   line 284, data channel 2 captioning
111   line 284, data channel 2 text



Table 8.40. CEA-608 Caption Services Format. 



D6   D5   D4   D3   D2   D1   D0

1    0    B4   B3   B2   B1   B0
1    0    0    0    0    0    RCD

B4-B3: CGMS-A Services
00    copying permitted without restriction
01    no more copies
10    one generation copy allowed
11    no copying permitted

B2-B1: Analog Protection Services (APS)
00    no Analog Protection Service
01    PSP on, color striping off
10    PSP on, 2-line color striping on
11    PSP on, 4-line color striping on



Table 8.41. CEA-608 Copy and Redistribution Control Packet Format. 
D6   D5   D4   D3   D2   D1   D0   Character

1    m5   m4   m3   m2   m1   m0   minute
1    -    h4   h3   h2   h1   h0   hour



Table 8.42. CEA-608 Tape Delay Format. 



D6   D5   D4   D3    D2    D1    D0    Character

1    -    -    t3    t2    t1    t0    TSID 0
1    -    -    t7    t6    t5    t4    TSID 1
1    -    -    t11   t10   t9    t8    TSID 2
1    -    -    t15   t14   t13   t12   TSID 3



Table 8.43. CEA-608 Transmission Signal Identifier (TSID) Format. 



D6   D5   D4   D3   D2   D1   D0   Character

1    m5   m4   m3   m2   m1   m0   minute
1    D    h4   h3   h2   h1   h0   hour
1    L    d4   d3   d2   d1   d0   date
1    Z    T    m3   m2   m1   m0   month
1    -    -    -    D2   D1   D0   day
1    Y5   Y4   Y3   Y2   Y1   Y0   year



Table 8.44. CEA-608 Time of Day Format. 
Type Definitions (Miscellaneous)

Time of Day (0x01)

This packet uses six characters to specify 
the current time of day, month, and date rela- 
tive to Coordinated Universal Time (UTC). 
The format is shown in Table 8.44. 

Minutes have a range of 0-59. Hours have 
a range of 0-23. Dates have a range of 1-31. 
Months have a range of 1-12. Days have a 
range of 1 (Sunday) to 7 (Saturday). Years
have a range of 0-63 (added to 1990).

“T” indicates if a program is routinely tape 
delayed for the Mountain and Pacific time 
zones. “D” indicates whether daylight savings 
time currently is being observed. “L” indicates 
whether the local day is February 28th or 29th 
when it is March 1st UTC. “Z” indicates 
whether the seconds should be set to zero (to 
allow calibration without having to transmit the 
full 6 bits of seconds data).
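
To make the field packing in Table 8.44 concrete, the six characters can be unpacked as follows. This is a sketch under assumptions: the name `decode_time_of_day` and the returned keys are hypothetical, and the characters are assumed to have had parity removed.

```python
WEEKDAYS = ["Sunday", "Monday", "Tuesday", "Wednesday",
            "Thursday", "Friday", "Saturday"]

def decode_time_of_day(chars) -> dict:
    """Decode the six Time of Day characters (UTC)."""
    if len(chars) != 6:
        raise ValueError("expected exactly six characters")
    minute, hour, date, month, day, year = (ch & 0x7F for ch in chars)
    day &= 0x07
    if not 1 <= day <= 7:
        raise ValueError("day must be 1 (Sunday) to 7 (Saturday)")
    return {
        "minute": minute & 0x3F,
        "hour": hour & 0x1F,
        "dst": bool(hour & 0x20),            # D bit
        "date": date & 0x1F,
        "leap_day_flag": bool(date & 0x20),  # L bit
        "month": month & 0x0F,
        "tape_delayed": bool(month & 0x10),  # T bit
        "zero_seconds": bool(month & 0x20),  # Z bit
        "weekday": WEEKDAYS[day - 1],
        "year": 1990 + (year & 0x3F),        # 0-63 added to 1990
    }
```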

Impulse Capture ID (0x02)

This packet carries the program start time 
and length, and can be used to tell a VCR to
record this program. The format is shown in 
Table 8.45. 

Start and length minutes have a range of 
0-59. Start hours have a range of 0-23; length 
hours have a range of 0-63. Dates have a range 
of 1-31. Months have a range of 1-12. “T” indi- 
cates if a program is routinely tape delayed for 
the Mountain and Pacific time zones. The “D,” 
“L,” and “Z” bits are ignored by the decoder. 

Supplemental Data Location (0x03) 

This packet uses 2-32 characters to specify 
other lines where additional VBI data may be 
found. Table 8.46 shows the format. 

“F” indicates field one (“0”) or field two 
(“1”). N may have a value of 7-31, and indi- 
cates a specific line number. 



Local Time Zone and DST Use (0x04)

This packet uses two characters to specify 
the viewer time zone and whether the locality 
observes daylight savings time. The format is 
shown in Table 8.47. 

Hours have a range of 0-23. This is the 
nominal time zone offset, in hours, relative to 
UTC. “D” is a “1” when the area is using day- 
light savings time. 

Out-of-Band Channel Number (0x40)

This packet uses two characters to specify
the CATV channel number to which all
subsequent out-of-band packets refer. The
format is shown in Table 8.48.
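
The channel number is split across the two characters, six bits each, low part first. A minimal sketch of the reassembly, with the hypothetical name `decode_oob_channel`:

```python
def decode_oob_channel(low_char: int, high_char: int) -> int:
    """Combine the 'channel low' (c5-c0) and 'channel high' (c11-c6) characters."""
    return ((high_char & 0x3F) << 6) | (low_char & 0x3F)
```

For example, channel 100 would be carried as a low character of 0x64 (0x40 | 36) and a high character of 0x41 (0x40 | 1).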

Channel Map Pointer (0x41) 

This packet uses two characters to specify 
the channel number containing the Channel 
Map Header and Channel Map packets. 

Channel Map Header Packet (0x42)

This packet uses four characters to specify 
the number of channels in the channel map 
and current version number for the current 
map. 

Channel Map Packet (0x43) 

This packet uses two or four characters to 
specify the user channel number and its corre- 
sponding tuner channel number. Up to 6 
optional closed caption characters are included 
to convey the user channel’s call letters or net- 
work ID. 







D6   D5   D4   D3   D2   D1   D0   Character

1    m5   m4   m3   m2   m1   m0   start, minute
1    D    h4   h3   h2   h1   h0   start, hour
1    L    d4   d3   d2   d1   d0   start, date
1    Z    T    m3   m2   m1   m0   start, month
1    m5   m4   m3   m2   m1   m0   length, minute
1    h5   h4   h3   h2   h1   h0   length, hour



Table 8.45. CEA-608 Impulse Capture ID Format. 



D6   D5   D4   D3   D2   D1   D0   Character

1    F    N4   N3   N2   N1   N0   location



Table 8.46. CEA-608 Supplemental Data Location Format. 



D6   D5   D4   D3   D2   D1   D0   Character

1    D    h4   h3   h2   h1   h0   hour
0    0    0    0    0    0    0    null



Table 8.47. CEA-608 Local Time Zone and DST Use Format. 



D6   D5    D4    D3   D2   D1   D0   Character

1    c5    c4    c3   c2   c1   c0   channel low
1    c11   c10   c9   c8   c7   c6   channel high



Table 8.48. CEA-608 Out-of-Band Channel Number Format. 











Type Definitions (Public Service Class) 

National Weather Service Code (0x01) 

This packet conveys a weather-related 
emergency broadcast message that indicates 
the category, affected counties, and expiration 
time. 

National Weather Service Message (0x02) 

This packet conveys up to 32 characters of 
an actual text message as delivered by the 
National Weather Service. 

Caption (CC) and Text (T) Channels 

CC1, CC2, T1, and T2 are on line 21. CC3,
CC4, T3, and T4 are on line 284. A fifth channel 
on line 284 carries the Extended Data Ser- 
vices. T1-T4 are similar to CC1-CC4, but take 
over all or half of the screen to display scroll- 
ing text information. 



CC1 is usually the main caption channel. 
CC2 or CC3 is occasionally used for support- 
ing a second language version. 

Closed Captioning for PAL 

For (M) PAL, caption data may be present 
on lines 18 and 281; however, it may occasion- 
ally reside on any line between 18-22 and 281- 
285 due to editing. 

For (B, D, G, H, I, N, NC) PAL videotapes, 
caption data may be present on lines 22 and 
335; however, it may occasionally reside on any 
line between 22-26 and 335-339 due to editing. 
The data format, amplitudes, and rise and fall 
times match those used in the United States. 
The timing, as shown in Figure 8.58, is slightly 
different due to the 625-line horizontal timing. 



[Figure: line waveform showing the sync level, blank level, 4.43 MHz color burst (10 cycles), 7 cycles of 0.500 MHz clock run-in, and the data (two 7-bit + parity ASCII characters), with 240-288 ns rise/fall times (2T bar shaping). Labeled timings: 10.00 ±0.25 µs, 10.5 ±0.25 µs, 13.0 µs, 27.5 µs, and 34.0 µs.]

Figure 8.58. 625-Line Lines 22 and 335 Closed Captioning Timing. 
Widescreen Signaling and CGMS

To facilitate the handling of various aspect 
ratios of program material received by TVs, a 
widescreen signaling (WSS) system has been 
developed. This standard allows a WSS- 
enhanced 16:9 TV to display programs in their 
correct aspect ratio. 

625i Systems

625i (576i) systems are based on ITU-R 
BT.1119 and ETSI EN 300 294. For YPbPr and 
S-video interfaces, WSS is present on the Y sig- 
nal. For analog RGB interfaces, WSS is present 
on all three signals. 

The Analog Copy Generation Management 
System (CGMS-A) is also supported by the 
WSS signal. 

Data Timing 

For (B, D, G, H, I, N, NC) PAL, WSS data
is normally on line 23, as shown in Figure 8.59. 
However, due to video editing, WSS data may 
reside on any line between 23-27. 

The clock frequency is 5 MHz (±100 Hz).
The signal waveform should be a sine-squared
pulse, with a half-amplitude duration of 200 ±10
ns. The signal amplitude is 500 mV ±5%.

The NRZ data bits are processed by a bi-phase
code modulator, such that one data
period equals 6 elements at 5 MHz.

Data Content 

The WSS consists of a run-in code, a start
code, and 14 bits of data, as shown in Table 8.49.

Run-In 

The run-in consists of 29 elements at 5
MHz of a specific sequence, shown in Table 8.49.

Start Code 

The start code consists of 24 elements at 5
MHz of a specific sequence, shown in Table 8.49.

Group A Data 

The group A data consists of 4 data bits
that specify the aspect ratio. Each data bit
generates 6 elements at 5 MHz. b0 is the LSB.



[Figure: line waveform showing the sync level, blank level, color burst (43 IRE), and the 500 mV ±5% WSS signal made up of the run-in (29 elements at 5 MHz), start code (24 elements at 5 MHz), and data B0-B13 (84 elements at 5 MHz), with 190-210 ns rise/fall times (2T bar shaping). Labeled timings: 11.00 ±0.25 µs and 27.4 µs.]

Figure 8.59. 625-Line Line 23 WSS Timing. 







Field                          Elements at 5 MHz    Content

run-in                         29                   1 1111 0001 1100 0111 0001 1100 0111 (0x1F1C71C7)
start code                     24                   0001 1110 0011 1100 0001 1111 (0x1E3C1F)
group A (aspect ratio)         24                   b0, b1, b2, b3
group B (enhanced services)    24                   b4, b5, b6, b7 (b7 = “0” since reserved)
group C (subtitles)            18                   b8, b9, b10
group D (reserved)             18                   b11, b12, b13

For groups A-D, each data bit is coded as six elements: “0” = 000 111, “1” = 111 000.



Table 8.49. 625-Line WSS Information. 



b0, b1,    Aspect Ratio    Format            Position on    Active    Minimum
b2, b3     Label                             4:3 Display    Lines     Requirements

0001       4:3             full format       -              576       case 1
1000       14:9            letterbox         center         504       case 2
0100       14:9            letterbox         top            504       case 2
1101       16:9            letterbox         center         430       case 3
0010       16:9            letterbox         top            430       case 3
1011       >16:9           letterbox         center         -         case 4
0111       14:9            full format       center         576       -
1110       16:9            full format       -              576       -
                           (anamorphic)



Table 8.50. 625-Line WSS Group A (Aspect Ratio) Data Bit Assignments and Usage.

Table 8.50 lists the data bit assignments
and usage. The number of active lines listed in
Table 8.50 is for the exact aspect ratio (a =
1.33, 1.56, or 1.78).

The aspect ratio label indicates a range of 
possible aspect ratios (a) and number of active 
lines: 



4:3      a ≤ 1.46           527-576
14:9     1.46 < a ≤ 1.66    463-526
16:9     1.66 < a ≤ 1.90    405-462
>16:9    a > 1.90           <405
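
The label ranges above amount to a simple threshold test on the aspect ratio a. The sketch below is illustrative only: the function name `wss_aspect_label` is hypothetical, and treating each upper boundary (1.46, 1.66, 1.90) as inclusive is an assumption.

```python
def wss_aspect_label(a: float) -> str:
    """Map a picture aspect ratio to its WSS aspect ratio label."""
    if a <= 1.46:
        return "4:3"
    if a <= 1.66:
        return "14:9"
    if a <= 1.90:
        return "16:9"
    return ">16:9"
```

Under these assumptions, a 2.35:1 film would be labeled >16:9, while exact 4:3, 14:9, and 16:9 material falls in the middle of its own range.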



To allow automatic selection of the display 
mode, a 16:9 receiver should support the fol- 
lowing minimum requirements: 

Case 1: The 4:3 aspect ratio picture should be 
centered on the display, with black bars on the 
left and right sides. 

Case 2: The 14:9 aspect ratio picture should be 
centered on the display, with black bars on the 
left and right sides. Alternately, the picture 
may be displayed using the full display width 
by using a small (typically 8%) horizontal geo- 
metrical error. 

Case 3: The 16:9 aspect ratio picture should be 
displayed using the full width of the display. 

Case 4: The >16:9 aspect ratio picture should 
be displayed as in Case 3 or use the full height 
of the display by zooming in. 



Group B Data 

The group B data consists of four data bits 
that specify enhanced services. Each data bit 
generates six elements at 5 MHz. Data bit b4 is 
the LSB. Bits b5 and b6 are used for PALplus. 

b4: mode 

0 camera mode 

1 film mode 



b5: color encoding 

0 normal PAL 

1 Motion Adaptive ColorPlus 

b6: helper signals 

0 not present 

1 present 



Group C Data 

The group C data consists of three data 
bits that specify subtitles. Each data bit gener- 
ates six elements at 5 MHz. Data bit b8 is the 
LSB. 

b8: teletext subtitles 

0 no 

1 yes 

b9, b10: open subtitles

00 no 

01 outside active picture 

10 inside active picture 

11 reserved 



Group D Data 

The group D data consists of three data
bits that specify surround sound and copy protection.
Each data bit generates six elements at
5 MHz. Data bit b11 is the LSB.

b11: surround sound

0 no

1 yes

b12: copyright

0 no copyright asserted or unknown

1 copyright asserted

b13: copy protection

0 copying not restricted

1 copying restricted
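
Putting the run-in, start code, and bi-phase coded data groups together, the full 625-line WSS element sequence can be generated as a string of 5 MHz elements. This is a minimal sketch, assuming the 14 data bits are supplied in transmission order (b0 first); the name `encode_wss_625` is hypothetical.

```python
RUN_IN = format(0x1F1C71C7, "029b")     # 29 elements
START_CODE = format(0x1E3C1F, "024b")   # 24 elements

def encode_wss_625(bits) -> str:
    """Return the 137-element WSS sequence for data bits b0..b13."""
    if len(bits) != 14:
        raise ValueError("expected 14 data bits (b0-b13)")
    # Bi-phase coding: each data bit becomes six elements.
    data = "".join("111000" if b else "000111" for b in bits)
    return RUN_IN + START_CODE + data
```

For example, 16:9 full format (anamorphic) material with no other flags set would use bits 1, 1, 1, 0 for group A followed by ten zeros, giving 29 + 24 + 84 = 137 elements.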
525i Systems

EIA-J CPR-1204 and IEC 61880 define a 
widescreen signaling standard for 525i (480i) 
systems. For YPbPr and S-video interfaces, 
WSS is present on the Y signal. For analog 
RGB interfaces, WSS is present on all three 
signals. 

Data Timing 

Lines 20 and 283 are used to transmit the
WSS information, as shown in Figure 8.60. 
However, due to video editing, it may reside on 
any line between 20-24 and 283-287. 

The clock frequency is Fsc/8 or about
447.443 kHz; Fsc is the color subcarrier frequency
of 3.579545 MHz. The signal waveform
should be a sine-squared pulse, with a half-amplitude
duration of 2.235 µs ±50 ns. The signal
amplitude is 70 ±10 IRE for a “1,” and 0 ±5
IRE for a “0.”

Data Content 

The WSS consists of 2 bits of start code, 14
bits of data, and 6 bits of CRC, as shown in
Table 8.51. The CRC generator polynomial is
X^6 + X + 1, with all register bits preset to “1.”
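
A 6-bit CRC with generator X^6 + X + 1 and an all-ones preset can be computed with a bit-serial shift register. The sketch below is not taken from the standard: it assumes the CRC covers the 14 data bits in transmission order (b0 first) and that the register is sent most significant bit first; the name `wss525_crc` is hypothetical.

```python
def wss525_crc(data_bits) -> list[int]:
    """Compute a 6-bit CRC (X^6 + X + 1, preset to all ones)."""
    reg = 0x3F  # all six register bits preset to "1"
    for bit in data_bits:
        feedback = ((reg >> 5) & 1) ^ (bit & 1)
        reg = (reg << 1) & 0x3F
        if feedback:
            reg ^= 0x03  # taps for the X + 1 terms of the polynomial
    # Return the register MSB first, as six separate bits.
    return [(reg >> i) & 1 for i in range(5, -1, -1)]
```

A receiver using the same register can check a received word by running all 20 bits (data followed by CRC) through the routine; with this formulation, an error-free word leaves the register at zero.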

Start Code 

The start code consists of a “1” data bit fol- 
lowed by a “0” data bit, as shown in Table 8.51. 

Word 0 Data 

Word 0 data consists of 2 data bits:

b0, b1:

00    4:3 aspect ratio     normal
01    4:3 aspect ratio     letterbox
10    16:9 aspect ratio    anamorphic
11    reserved





Word 1 Data 

Word 1 data consists of 4 data bits: 

b2, b3, b4, b5: 

0000 copy control information 

1111 default 

Copy control information is transmitted in 
Word 2 data when Word 1 data is “0000.” When 
copy control information is not to be trans- 
ferred, Word 1 data must be set to the default 
value “1111.” 

Word 2 Data 

Word 2 data consists of 8 data bits (b6-b13). When
Word 1 data is “0000,” Word 2 data consists of
copy control information. Word 2 copy control
data must be transferred at the rate of two or
more frames per two seconds.

Bits b6 and b7 specify the copy generation
management system in an analog signal
(CGMS-A). CGMS-A consists of two bits of digital
information:

b6, b7: 

00 copying permitted 

01 reserved 

10 one copy permitted 

11 no copying permitted 

This CGMS-A information must also usually be 
conveyed via the line 284 Extended Data Ser- 
vices Copy and Redistribution Control packet 
discussed in the closed captioning section. 

Bits b8 and b9 specify the Analog Protec- 
tion Service (APS) added to the analog NTSC 
or PAL video signal: 

b8, b9: 

00 no Analog Protection Service 

01 PSP on, color striping off 

10 PSP on, 2-line color striping on 

11 PSP on, 4-line color striping on 
[Figure: line waveform showing the sync level, blank level, color burst (40 IRE), and the WSS signal at 70 ±10 IRE made up of start code “1,” start code “0,” and data (B0-B19), with 2235 ±50 ns rise/fall times (2T bar shaping). Labeled timings: 11.20 ±0.30 µs and 49.1 ±0.44 µs.]

Figure 8.60. 525-Line Lines 20 and 283 WSS Timing. 



start code    “1”
start code    “0”
word 0        b0, b1
word 1        b2, b3, b4, b5
word 2        b6, b7, b8, b9, b10, b11, b12, b13
CRC           b14, b15, b16, b17, b18, b19

Table 8.51. 525-Line WSS Data Bit Assignments and Usage.
PSP is a pseudo-sync pulse operation that,
if on, will be present on the composite, S-video, 
and Y (of YPbPr) analog video outputs. Color 
striping operation inverts the normal phase of 
the first half of the color burst signal on certain 
scan lines on the composite and S-video analog 
video outputs. 

This Analog Protection Service (APS) 
information must also usually be conveyed via 
the line 284 Extended Data Services Copy and 
Redistribution Control packet discussed in the 
closed captioning section. 

Bit b10 specifies whether the source originated
from an analog pre-recorded medium.

b10:

0 not analog pre-recorded medium

1 analog pre-recorded medium

Bits b11, b12, and b13 are reserved and are
“000.”

Teletext 

Teletext allows the transmission of text,
graphics, and data. Data may be transmitted on
any line, although the VBI interval is most
commonly used. The teletext standards are
specified by ETSI EN 300 706, ITU-R BT.653,
and EIA-516.

For YPbPr and S-video interfaces, teletext 
is present on the Y signal. For analog RGB 
interfaces, teletext is present on all three sig- 
nals. 

There are many systems that use the tele- 
text physical layer to transmit proprietary 
information. The advantage is that teletext has 
already been approved in many countries for 
broadcast, so certification for a new transmis- 
sion technique is not required. 

The data rate for teletext is much higher 
than that used for closed captioning, approach- 
ing up to 7 Mbps in some cases. Therefore, 
ghost cancellation is needed to recover the 
transmitted data reliably. 

There are seven teletext systems defined, 
as shown in Table 8.52. System B (also known 
as World System Teletext, or WST) has 
become the de facto standard and most widely 
adopted solution. 



Parameter          System A    System B    System C    System D

625-Line Video Systems
bit-rate (Mbps)    6.203125    6.9375      5.734375    5.6427875
data amplitude     67 IRE      66 IRE      70 IRE      70 IRE
data per line      40 bytes    45 bytes    36 bytes    37 bytes

525-Line Video Systems
bit-rate (Mbps)    -           5.727272    5.727272    5.727272
data amplitude     -           70 IRE      70 IRE      70 IRE
data per line      -           37 bytes    36 bytes    37 bytes



Table 8.52. Summary of Teletext Systems and Parameters. 









EIA-516, also referred to as NABTS (North
American Broadcast Teletext Specification),
saw limited use in the United States, and was
an expansion of the BT.653 525-line system C
standard.

Figure 8.61 illustrates the teletext data on a 
scan line. If a line normally contains a color 
burst signal, it will still be present if teletext 
data is present. The 16 bits of clock run-in (or
clock sync) consist of alternating “1s” and
“0s.”

Figures 8.62 and 8.63 illustrate the struc- 
ture of teletext systems B and C, respectively. 

System B Teletext Overview 

Since teletext System B is the de facto teletext
standard, a basic overview is presented
here.

A teletext service typically consists of 
pages, with each page corresponding to a 
screen of information. The pages are transmit- 
ted one at a time, and after all pages have been 
transmitted, the cycle repeats, with a typical 
cycle time of about 30 seconds. However, the 
broadcaster may transmit some pages more 
frequently than others, if desired. 



The teletext service is usually based on up
to eight magazines (allowing up to eight independent
teletext services), with each magazine
containing up to 100 pages. Magazine 1 uses
page numbers 100-199, magazine 2 uses page
numbers 200-299, etc. Each page may also
have sub-pages, used to extend the number of
pages within a magazine.

Each page contains 24 rows, with up to 40 
characters per row. A character may be a letter, 
number, symbol, or simple graphic. There are 
also control codes to select colors and other 
attributes such as blinking and double height. 

In addition to teletext information, the teletext
protocol may be used to transmit other
information, such as subtitling, program delivery
control (PDC), and private data.

Subtitling 

Subtitling is similar to the closed caption- 
ing used in the United States. Open subtitles are 
the insertion of text directly into the picture 
prior to transmission. Closed subtitles are trans- 
mitted separately from the picture. The trans- 
mission of closed subtitles in the UK uses 
teletext page 888. In the case where multiple
languages are transmitted using teletext, separate
pages are used for each language.

Figure 8.61. Teletext Line Format.

Figure 8.62. Teletext System B Structure.

Figure 8.63. Teletext System C Structure.

Program Delivery Control (PDC) 

Program Delivery Control (defined by 
ETSI EN 300 231 and ITU-R BT.809) is a sys- 
tem that controls VCR recording using teletext 
information. The VCR can be programmed to 
look for and record various types of programs 
or a specific program. Programs are recorded 
even if the transmission time changes for any 
reason. 

There are two methods of transmitting 
PDC information via teletext: methods A and 
B. 

Method A places the data on a viewable
teletext page, and is usually transmitted on
scan line 16. This method is also known as the
Video Programming System (VPS).

Method B places the data on a hidden
packet (packet 26) in the teletext signal. This
packet 26 data contains the data on each
program, including channel, program date, and
start time.

Data Broadcasting 

Data broadcasting may be used to transmit 
information to private receivers. Typical appli- 
cations include real-time financial information, 
airport flight schedules for hotels and travel 
agents, passenger information for railroads, 
software upgrades, etc. 

Packets 0-23 

A typical teletext page uses 24 packets,
numbered 0-23, that correspond to the 24
rows on a displayed page. Packet 24 can add a
status row at the bottom for user prompting.
For each packet, three bits specify the
magazine address (1-8), and five bits specify the
row address (0-23). The magazine and row
address bits are Hamming error protected to
permit single-bit errors to be corrected.
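
The single-bit correction described above can be sketched as a nearest-codeword lookup. This is only an illustration: the codeword table is the one commonly tabulated for the teletext 8/4 Hamming code and should be checked against the governing specification, and the bit layout (magazine in the low three bits with magazine 8 sent as 0, row in the next five bits, low nibble first) is an assumption, as are all function names.

```python
# Hamming 8/4 codewords for nibbles 0-15 (assumed table; verify against the spec).
HAMMING_8_4 = [
    0x15, 0x02, 0x49, 0x5E, 0x64, 0x73, 0x38, 0x2F,
    0xD0, 0xC7, 0x8C, 0x9B, 0xA1, 0xB6, 0xFD, 0xEA,
]


def hamming_decode(byte):
    """Return the nibble whose codeword is within one bit of `byte`, else None."""
    for nibble, codeword in enumerate(HAMMING_8_4):
        if bin(byte ^ codeword).count("1") <= 1:
            return nibble
    return None  # two or more bit errors: not correctable


def encode_address(magazine, row):
    """Pack magazine (1-8) and row (0-31) into two protected bytes.

    Assumed layout: magazine in bits 0-2 (8 is sent as 0), row in bits 3-7,
    least significant nibble transmitted first.
    """
    value = (magazine & 0x07) | (row << 3)
    return HAMMING_8_4[value & 0x0F], HAMMING_8_4[value >> 4]


def decode_address(byte1, byte2):
    """Recover (magazine, row), or None if either byte cannot be corrected."""
    low, high = hamming_decode(byte1), hamming_decode(byte2)
    if low is None or high is None:
        return None
    value = low | (high << 4)
    magazine = (value & 0x07) or 8
    return magazine, value >> 3
```

Because every pair of codewords in the table differs in at least four bit positions, a received byte with one flipped bit still lies closest to exactly one codeword, which is what makes the correction unambiguous.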

To save bandwidth, the whole address isn’t 
sent with all packets. Only packet 0 (also called 
the header packet) has all the address informa- 
tion such as row, page, and magazine address 
data. Packets 1-28 contain information that is 
part of the page identified by the most recent 
packet 0 of the same magazine. 

The transmission of a page starts with a 
header packet. Subsequent packets with the 
same magazine address provide additional 
data for that page. These packets may be trans- 
mitted in any order, and interleaved with pack- 
ets from other magazines. A page is 
considered complete when the next header 
packet for that magazine is received. 

The general format for packet 0 is:

clock run-in                  2 bytes
framing code                  1 byte
magazine and row address      2 bytes
page number                   2 bytes
subcode                       4 bytes
control codes                 2 bytes
display data                 32 bytes

The general format for packets 1-23 is:

clock run-in                  2 bytes
framing code                  1 byte
magazine and row address      2 bytes
display data                 40 bytes
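
As a rough illustration of the two layouts above, a decoder could describe each packet type as a list of (field, size) pairs and slice a captured packet accordingly. The field names and the `split_packet` helper are hypothetical, and the sketch leaves out the Hamming and parity decoding a real decoder would apply to the sliced bytes.

```python
# Field sizes in bytes, taken from the packet formats listed above.
PACKET_0_LAYOUT = [
    ("clock_run_in", 2),
    ("framing_code", 1),
    ("magazine_row_address", 2),
    ("page_number", 2),
    ("subcode", 4),
    ("control_codes", 2),
    ("display_data", 32),
]

PACKET_1_23_LAYOUT = [
    ("clock_run_in", 2),
    ("framing_code", 1),
    ("magazine_row_address", 2),
    ("display_data", 40),
]


def split_packet(packet, layout):
    """Slice a raw packet into named byte fields following `layout`."""
    if len(packet) != sum(size for _, size in layout):
        raise ValueError("packet length does not match layout")
    fields, offset = {}, 0
    for name, size in layout:
        fields[name] = packet[offset:offset + size]
        offset += size
    return fields
```

Both layouts add up to 45 bytes, so the header packet gives up eight display bytes to make room for the page number, subcode, and control codes.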



Packet 24 

This packet defines an additional row for
user prompting. Teletext decoders may use
the data in packet 27 to react to prompts in the
packet 24 display row.

Packet 25

This packet defines a replacement header 
line. If present, the 40 bytes of data are dis- 
played instead of the channel, page, time, and 
date from packet 8.30. 

Packet 26 

Packet 26 consists of:

clock run-in                  2 bytes
framing code                  1 byte
magazine and row address      2 bytes
designation code              1 byte
13 3-byte data groups        39 bytes

Each 3-byte data group consists of:

7 data bits
6 address bits
5 mode bits
6 Hamming bits

There are 15 variations of packet 26,
defined by the designation code. Each of the 13
data groups specifies a display location
and data relating to that location.

This packet is also used to extend the 
addressable range of the basic character set in 
order to support other languages, such as Ara- 
bic, Spanish, Hungarian, Chinese, etc. 

For PDC, packet 26 contains data for each 
program, identifying the channel, program 
date, start time, and the cursor position of the 
program information on the page. When the 
user selects a program, the cursor position is 
linked to the appropriate packet 26 preselec- 
tion data. This data is then used to program the 
VCR. When the program is transmitted, the 
program information is transmitted using 
packet 8.30 format 2. A match between the pre- 
selection data and the packet 8.30 data turns 
the VCR record mode on. 

Packet 27 

Packet 27 tells the teletext decoder how to
respond to user selections for packet 24. There
may be up to four packet 27s (packets 27/0
through 27/3), allowing up to 24 links. Packet
27 consists of:

clock run-in                  2 bytes
framing code                  1 byte
magazine and row address      2 bytes
designation code              1 byte
link 1 (red)                  6 bytes
link 2 (green)                6 bytes
link 3 (yellow)               6 bytes
link 4 (cyan)                 6 bytes
link 5 (next page)            6 bytes
link 6 (index)                6 bytes
link control data             1 byte
page check digit              2 bytes

Each link consists of:

7 data bits
6 address bits
5 mode bits
6 Hamming bits

This packet contains information linking
the current page to six page numbers (links).
The four colored links correspond to the four
colored Fastext page request keys on the
remote. Typically, these four keys correspond
to four colored menu selections at the bottom
of the display using packet 24. Selection of one
of the colored page request keys results in the
selection of the corresponding linked page.

The fifth link is used for specifying a page 
the user might want to see after the current 
page, such as the next page in a sequence. 

The sixth link corresponds to the Fastext 
index key on the remote, and specifies the 
page address to go to when the index is 
selected. 

Packets 28 and 29 

These are used to define level 2 and level 3
pages to support higher resolution graphics,
additional colors, alternate character sets, etc.
They are similar in structure to packet 26.

Packet 8.30 Format 1

Packet 8.30 (magazine 8, packet 30) isn’t
associated with any page, but is sent once per
second. This packet is also known as the
Television Service Data Packet, or TSDP. It
contains data that notifies the teletext decoder
about the transmission in general and the time.



clock run-in                  2 bytes
framing code                  1 byte
magazine and row address      2 bytes
designation code              1 byte
initial teletext page         6 bytes
network ID                    2 bytes
time offset from UTC          1 byte
date (Modified Julian Day)    3 bytes
UTC time                      3 bytes
TV program label              4 bytes
status display               20 bytes

The Designation Code indicates whether
the transmission is during the VBI or full-field.

Initial Teletext Page tells the decoder 
which page should be captured and stored on 
power-up. This is usually an index or menu 
page. 

The Network Identification code identifies 
the transmitting network. 

The TV Program Label indicates the pro- 
gram label for the current program. 

Status Display is used to display a transmis- 
sion status message. 
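
Since the date field is carried as a Modified Julian Day, a receiver needs to turn that day count into a calendar date. The sketch below shows only that arithmetic, using the standard definition that MJD 0 falls on 17 November 1858; it assumes the day number has already been extracted from the three date bytes, whose on-air coding is not covered here. The function names are illustrative.

```python
from datetime import date, timedelta

MJD_EPOCH = date(1858, 11, 17)  # Modified Julian Day 0


def mjd_to_date(mjd):
    """Convert a Modified Julian Day number to a calendar date."""
    return MJD_EPOCH + timedelta(days=mjd)


def date_to_mjd(day):
    """Convert a calendar date back to its Modified Julian Day number."""
    return (day - MJD_EPOCH).days
```

For example, `mjd_to_date(51544)` gives 1 January 2000. The local date can differ from this UTC-based date near midnight, which is why the packet also carries a time offset from UTC.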

Packet 8.30 Format 2 

This format is used for PDC recorder
control, and is transmitted once per second per
stream. It contains a program label indicating
the start of each program, usually transmitted
about 30 seconds before the start of the
program to allow the VCR to detect it and get
ready to record.

clock run-in                  2 bytes
framing code                  1 byte
magazine and row address      2 bytes
designation code              1 byte
initial teletext page         6 bytes
label channel ID              1 byte
program control status        1 byte
country and network ID        2 bytes
program ID label              5 bytes
country and network ID        2 bytes
program type                  2 bytes
status display               20 bytes

The content is the same as for Format 1,
except for the 13 bytes of information before
the status display information.

Label channel ID (LCI) identifies each of 
up to four PDC streams that may be transmit- 
ted simultaneously. 

The Program Control Status (PCS) indi- 
cates real-time status information, such as the 
type of analog sound transmission. 

The Country and Network ID (CNI) is split 
into two groups. The first part specifies the 
country and the second part specifies the net- 
work. 

Program ID Label (PIL) specifies the 
month, day, and local time of the start of the 
program. 

Program Type (PTY) is a code that
indicates an intended audience or a particular
series. Examples are “adult,” “children,”
“music,” “drama,” etc.

Packet 31

Packet 31 is used for the transmission of
data to private receivers. It consists of:

clock run-in                  2 bytes
framing code                  1 byte
data channel group            1 byte
message bits                  1 byte
format type                   1 byte
address length                1 byte
address                       0-6 bytes
repeat indicator              0-1 byte
continuity indicator          0-1 byte
data length                   0-1 byte
user data                     28-36 bytes
CRC                           2 bytes



AMOL (Automated Measurement of 
Lineups) 

AMOL I 

Lines 20, 22, 283, and/or 284 are used to 
transmit the AMOL I information, as shown in 
Figure 8.64. However, it may reside on any 
480i VBI line, due to unintentional shifting 
caused by editing, compression, etc. The 1 
Mbps payload may change as often as every 
frame. 

Each of the 48 data bits is 1000 ±100 ns 
wide with a maximum rise and fall time of 300 
ns. A logical “1” has an amplitude of 55 ±5 IRE; 
a logical “0” has an amplitude of 0-10 IRE. 

AMOL II 

Lines 20, 22, 283, and/or 284 are used to 
transmit the AMOL II information, as shown in 
Figure 8.65. However, it may reside on any 
480i VBI line, due to unintentional shifting 
caused by editing, compression, etc. The 2 
Mbps payload may change as often as every 
frame. 



Each of the 96 data bits is 500 ±50 ns wide 
with a maximum rise and fall time of 150 ns. A 
logical “1” has an amplitude of 55 ±5 IRE; a log- 
ical “0” has an amplitude of 0-10 IRE. 

Raw VBI Data 

Raw, or oversampled, VBI data is simply 
digitized VBI data. It is typically oversampled 
using a 2x video sample clock, such as 27 MHz 
for 480i and 54 MHz for 480p video. Use of the 
2x video sample clock enables transferring the 
raw VBI data over a standard 8-bit BT.656 inter- 
face. VBI data may be present on any scan line, 
except during the serration and equalization 
intervals. 

The raw VBI data is then converted to 
binary (or sliced) data and processed and/or 
passed through to the composite, S-video, and 
YPbPr analog video outputs so it may be 
decoded by the TV. 

In the conversion from raw to sliced VBI 
data, the VBI decoders must compensate for 
varying DC offsets, amplitude variations, 
ghosting, and timing variations. 

Hysteresis must also be used to prevent 
the VBI decoders from turning on and off rap- 
idly due to noise and transmission errors. 
Once the desired VBI signal is found for (typi- 
cally) 15 consecutive frames, VBI decoding 
should commence. When the desired VBI sig- 
nal is not found on the appropriate scan lines 
for (typically) 45 consecutive frames, VBI 
decoding should stop. 
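
The start and stop behavior described above amounts to a two-threshold state machine. The following is a minimal sketch using the typical counts of 15 and 45 frames from the text as defaults; the class and method names are hypothetical, and a real decoder would track each VBI service separately.

```python
class VbiHysteresis:
    """Track whether VBI decoding should be active, frame by frame."""

    def __init__(self, start_frames=15, stop_frames=45):
        self.start_frames = start_frames  # consecutive detections needed to start
        self.stop_frames = stop_frames    # consecutive misses needed to stop
        self.decoding = False
        self._run = 0  # consecutive frames that contradict the current state

    def update(self, signal_found):
        """Feed one frame's detection result; return True while decoding."""
        if signal_found != self.decoding:
            self._run += 1
            threshold = self.stop_frames if self.decoding else self.start_frames
            if self._run >= threshold:
                self.decoding = not self.decoding
                self._run = 0
        else:
            self._run = 0  # any agreeing frame resets the count
        return self.decoding
```

Using a longer run to stop than to start means a brief dropout, such as a noisy stretch of a few frames, does not interrupt a service that is already being decoded.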

Sliced VBI Data 

Sliced, or binary, VBI data is commonly
available from NTSC/PAL video decoders.
This has the advantage of lower data rates
since binary, rather than oversampled, data is
present. The primary disadvantage is the
variety of techniques NTSC/PAL video decoder
chip manufacturers use to transfer the sliced
VBI data over the video interface.

NTSC/PAL Decoder Considerations 

Closed Captioning 

In addition to caption and text commands 
that clear the display, five other events typically 
force the display to be cleared: 

(1) A change in the caption display mode,
such as switching from CC1 to T1.

(2) A loss of video lock, such as on a channel 
change, forces the display to be cleared. The 
currently active display mode does not 
change. For example, if CC1 was selected 
before loss of video lock, it remains selected. 

(3) Activation of autoblanking. If the caption 
signal has not been detected for (typically) 15 
consecutive frames, or no new data for the 
selected channel has been received for (typi- 
cally) 45 frames, the display memory is 
cleared. Once the caption signal has been 
detected for (typically) 15 consecutive frames, 
or new data has been received, it is displayed. 

(4) A clear command (from the remote con- 
trol for example) forces the display to be 
cleared. 

(5) Disabling caption decoding also forces the 
display to be cleared. 



Widescreen Signaling 

The decoder must be able to handle a vari- 
ety of WSS inputs including: 

(1) PAL or NTSC WSS signal on composite,
S-video, or Y (of YPbPr).

(2) SCART analog inputs (DC offset indicator)

(3) S-video analog inputs (DC offset indicator)

In addition to automatically processing the 
video signal to fit a 4:3 or 16:9 display based on 
the WSS data, the decoder should also support 
manual overrides in case the user wishes a 
specific mode of operation due to personal 
preferences. Software uses this aspect ratio 
information, user preferences, and display for- 
mat to assist in properly processing the pro- 
gram for display. 

Ghost Cancellation 

Ghost cancellation (the removal of undes- 
ired reflections present in the signal) is 
required due to the high data rate of some ser- 
vices, such as teletext. Ghosting greater than 
100 ns and -12 dB corrupts teletext data. 
Ghosting greater than -3 dB is difficult to 
remove cost-effectively in hardware or soft- 
ware, while ghosting less than -12 dB need not 
be removed. Ghost cancellation for VBI data is 
not as complex as ghost cancellation for active 
video. 

Unfortunately, the GCR (ghost cancella- 
tion reference) signal is not usually present. 
Thus, a ghost cancellation algorithm must 
determine the amount of ghosting using other 
available signals, such as the serration and 
equalization pulses. 

The NTSC GCR signal is specified in ATSC
A/49 and ITU-R BT.1124. If present, it occupies
lines 19 and 282. The GCR permits the
detection of ghosting from -3 to +45 μs, and follows
an 8-field sequence.

The PAL GCR signal is specified in
BT.1124 and ETSI ETS 300 732. If present, it
occupies line 318. The GCR permits the
detection of ghosting from -3 to +45 μs, and follows
a 4-frame sequence.

Enhanced Television Programming

The enhanced television programming
standard (SMPTE 363M) is used for creating
and delivering enhanced and interactive
programs. The enhanced content can be delivered
over a variety of mediums, including analog
and digital television broadcasts, using
terrestrial, cable, and satellite networks. In defining
how to create enhanced content, the
specification defines the minimum receiver
functionality. To minimize the creation of new
specifications, it leverages Internet
technologies such as HTML and Javascript. The
benefits of doing this are that there are already
millions of pages of potential content, and the
ability to use existing web-authoring tools.

The specification mandates that receivers 
support, as a minimum, HTML 4.0, Javascript 
1.1, and Cascading Style Sheets. Supporting 
additional capabilities, such as Java and VRML, 
is optional. This ensures content is available to 
the maximum number of viewers. 

For increased capability, a new “tv:” 
attribute is added to the HTML. This attribute 
enables the insertion of the television program 
into the content, and may be used in an HTML 
document anywhere that a regular image may 
be placed. Creating an enhanced content page 
that displays the current television channel 
anywhere on the display is as easy as inserting 
an image in an HTML document. 

The specification also defines how the 
receivers obtain the content and how they are 
informed that enhancements are available. The 
latter task is accomplished with triggers. 

Triggers 

Triggers alert receivers to content
enhancements, and contain information about
the enhancements. Among other things,
triggers contain a universal resource locator
(URL) that defines the location of the
enhanced content. Content may reside
locally, such as when delivered over the
network and cached to a local hard drive, or it
may reside on the Internet or another network.

Triggers may also contain a human-read- 
able description of the content. For example, it 
may contain the description “Press ORDER to 
order this product,” which can be displayed for 
the viewer. Triggers also may contain expira- 
tion information, indicating how long the 
enhancement should be offered to the viewer. 

Lastly, triggers may contain scripts that 
trigger the execution of Javascript within the 
associated HTML page, to support synchroni- 
zation of the enhanced content with the video 
signal and updating of dynamic screen data. 

The processing of triggers is defined in 
SMPTE 363M and is independent of the 
method used to carry them. 

Transports 

Besides defining how content is displayed 
and how the receiver is notified of new content, 
the specification also defines how content is 
delivered. Because a receiver may not have an 
Internet connection, the specification 
describes two models for delivering content. 
These two models are called transports, and 
the two transports are referred to as Transport 
Type A and Transport Type B. 

If the receiver has a back-channel (or 
return path) to the Internet, Transport Type A 
will broadcast the trigger and the content will 
be pulled over the Internet. 

If the receiver does not have an Internet
connection, Transport Type B provides for
delivery of both triggers and content via the
broadcast medium. Announcements are sent
over the network to associate triggers with
content streams. An announcement describes
the content, and may include information
regarding bandwidth, storage requirements,
and language.

Delivery Protocols 

For traditional bi-directional Internet com- 
munication, the Hypertext Transfer Protocol 
(HTTP) defines how data is transferred at the 
application level. For uni-directional broad- 
casts where a two-way connection is not avail- 
able, SMPTE 364M defines a uni-directional 
application-level protocol for data delivery: 
Uni-directional Hypertext Transfer Protocol 
(UHTTP). 

Like HTTP, UHTTP uses traditional URL 
naming schemes to reference content. Content 
can reference enhancement pages using the 
standard “http:” and “ftp:” naming schemes. A 
“lid:,” or local identifier, URL is also available to 
allow reference to content that exists locally 
(such as on the receiver’s hard drive) as 
opposed to on the Internet or other network. 

Bindings 

How data is delivered over a specific net- 
work is called “binding.” Bindings have been 
defined for NTSC and PAL. 

NTSC Bindings 

Transport Type A triggers are broadcast 
on data channel 2 of the CEA-608 captioning 
signal. 

Transport Type B binding also includes a
mechanism for delivering IP multicast packets
over the vertical blanking interval (VBI),
otherwise known as IP over VBI (IP/VBI). At the
lowest level, the television signal transports
NABTS (North American Basic Teletext
Standard) packets during the VBI. These NABTS
packets are recovered to form a sequential
data stream (encapsulated in a SLIP-like
protocol) that is unframed to produce IP packets.

PAL Bindings 

Both transport types are based on carriage 
of IP multicast packets in VBI lines of a PAL 
system by means of teletext packets 30 or 31. 

Transport Type A triggers are carried in 
UDP/IP multicast packets, delivered to 
address 224.0.23.13 and port 2670. 
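
Once the IP packets have been recovered from the teletext lines and handed to an ordinary IP stack (an assumption about the receiver architecture, not something the binding requires), listening for Transport Type A triggers looks like any other multicast subscription. The sketch below uses the standard Python socket API; the function names are hypothetical.

```python
import socket
import struct

TRIGGER_GROUP = "224.0.23.13"  # multicast address given in the text
TRIGGER_PORT = 2670            # port given in the text


def membership_request(group, interface="0.0.0.0"):
    """Build the 8-byte ip_mreq structure used to join `group` on `interface`."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))


def open_trigger_socket(group=TRIGGER_GROUP, port=TRIGGER_PORT):
    """Return a UDP socket subscribed to the trigger multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock
```

A caller would then loop on `sock.recvfrom(...)` and pass each datagram payload to the trigger parser.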

Transport Type B (described in SMPTE
357M) carries a single trigger in a single
UDP/IP multicast packet, delivered on the
address and port defined in the SDP
announcement for the enhanced television program. The
trigger protocol is very lightweight in order to
provide quick synchronization.



Chapter 9



NTSC and PAL Digital
Encoding and Decoding



Although not exactly digital video, the 
NTSC and PAL composite color video formats 
are currently the most common formats for 
video. Although the video signals themselves 
are analog, they can be encoded and decoded 
almost entirely digitally. 

Analog NTSC and PAL encoders and 
decoders have been available for some time. 
However, they have been difficult to use, 
required adjustment, and offered limited video 
quality. Using digital techniques to implement 
NTSC and PAL encoding and decoding offers 
many advantages such as ease of use, mini- 
mum analog adjustments, and excellent video 
quality. 



In addition to composite video, S-video is 
supported by consumer and pro-video equip- 
ment, and should also be implemented. S-video 
uses separate luminance (Y) and chrominance 
(C) analog video signals so higher quality may 
be maintained by eliminating the Y/C separa- 
tion process. 

This chapter discusses the design of a
digital encoder (Figure 9.1) and decoder (Figure
9.21) that support composite and S-video (M)
NTSC and (B, D, G, H, I, Nc) PAL video
signals. (M) and (N) PAL are easily
accommodated with some slight modifications.

NTSC encoders and decoders are usually
based on the YCbCr, YUV, or YIQ color space.
PAL encoders and decoders are usually based
on the YCbCr or YUV color space.

Video Standard               Sample Clock   Applications    Active        Total        Field Rate
                             Rate                           Resolution    Resolution   (per second)

(M) NTSC, (M) PAL            9 MHz          SVCD            480 x 480i    572 x 525i   59.94 interlaced
                             13.5 MHz       BT.601          720¹ x 480i   858 x 525i
                             13.5 MHz       MPEG-2          704 x 480i    858 x 525i
                             13.5 MHz       DV              720 x 480i    858 x 525i
                             12.27 MHz      square pixels   640 x 480i    780 x 525i

(B, D, G, H, I, N, Nc) PAL   9 MHz          SVCD            480 x 576i    576 x 625i   50 interlaced
                             14.75 MHz      square pixels   768 x 576i    944 x 625i
                             13.5 MHz       BT.601          720² x 576i   864 x 625i
                             13.5 MHz       MPEG-2          704 x 576i    864 x 625i
                             13.5 MHz       DV              720 x 576i    864 x 625i

Table 9.1. Common NTSC/PAL Sample Rates and Resolutions. ¹Typically 716 true active
samples between 10% blanking points. ²Typically 702 true active samples between 50%
blanking points.
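
The total horizontal resolutions in Table 9.1 follow from dividing the sample clock by the horizontal line rate. The short check below illustrates this; the line rates (4.5 MHz / 286 for 525-line systems and 15,625 Hz for 625-line systems) are the standard values rather than figures quoted in this section, and the function name is illustrative.

```python
NTSC_LINE_RATE = 4.5e6 / 286  # about 15,734.27 Hz for 525-line systems
PAL_LINE_RATE = 15625.0       # Hz for 625-line systems


def samples_per_line(sample_clock_hz, line_rate_hz):
    """Total samples per scan line, including horizontal blanking."""
    return round(sample_clock_hz / line_rate_hz)


for clock in (9e6, 13.5e6, 12.27e6):
    print(clock / 1e6, "MHz ->", samples_per_line(clock, NTSC_LINE_RATE))

for clock in (9e6, 14.75e6, 13.5e6):
    print(clock / 1e6, "MHz ->", samples_per_line(clock, PAL_LINE_RATE))
```

The rounding matters only for the square-pixel NTSC rate, where 12.27 MHz is itself a rounded figure.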



NTSC and PAL Encoding 

YCbCr input data has a nominal range of 
16-235 for Y and 16-240 for Cb and Cr. RGB 
input data has a range of 0-255; pro-video appli- 
cations may use a nominal range of 16-235. 

As YCbCr values outside these ranges 
result in overflowing the standard YIQ or YUV 
ranges for some color combinations, one of 
three things may be done, in order of prefer- 
ence: (a) allow the video signal to be generated 
using the extended YIQ or YUV ranges; (b) 
limit the color saturation to ensure a legal 
video signal is generated; or (c) clip the YIQ or 
YUV levels to the valid ranges. 

4:1:1, 4:2:0, or 4:2:2 YCbCr data must be 
converted to 4:4:4 YCbCr data before being 
converted to YIQ or YUV data. The chromi- 
nance lowpass filters will not perform the inter- 
polation properly. 

Table 9.1 lists some of the common sample 
rates and resolutions. 



2x Oversampling 

2x oversampling generates 8:8:8 YCbCr or 
RGB data, simplifying the analog output filters. 
The oversampler is also a convenient place to 
convert from 8-bit to 10-bit data, providing an 
increase in video quality. 
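
To make the 8-bit to 10-bit step concrete, here is a deliberately simple sketch that doubles the sample rate with linear interpolation while keeping two extra bits of precision. Linear interpolation is an assumption chosen for brevity; a practical oversampler would use a longer interpolation filter. The function name is hypothetical.

```python
def oversample_2x(samples_8bit):
    """Double the sample rate, returning 10-bit values.

    Original samples are scaled by 4; each inserted sample is the average of
    its neighbors, computed at 10-bit precision instead of being rounded
    back to 8 bits.
    """
    out = []
    last = len(samples_8bit) - 1
    for i, current in enumerate(samples_8bit):
        following = samples_8bit[min(i + 1, last)]  # repeat the final sample
        out.append(current << 2)                # original sample at 10 bits
        out.append((current + following) << 1)  # midpoint at 10 bits
    return out
```

With this scaling, the 8-bit nominal limits of 16 and 235 become 64 and 940, and the 8-bit chroma midpoint of 128 becomes 512, which is where the offsets in the 10-bit conversion equations come from.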

Color Space Conversion 

Choosing the 10-bit video levels to be
white = 800 and sync = 16, and knowing that
the sync-to-white amplitude is 1V, the full-scale
output of the D/A converters (DACs) is
therefore set to 1.305V.
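
The full-scale figure can be checked with a few lines of arithmetic. The sketch assumes that full scale corresponds to the largest 10-bit code, 1023; the variable names are illustrative.

```python
WHITE_CODE = 800
SYNC_CODE = 16
SYNC_TO_WHITE_VOLTS = 1.0
MAX_CODE = 1023  # largest 10-bit DAC code (assumed to map to full scale)

# 784 codes span 1 V, so each code step is 1/784 V.
volts_per_code = SYNC_TO_WHITE_VOLTS / (WHITE_CODE - SYNC_CODE)
full_scale_volts = MAX_CODE * volts_per_code

print(f"{volts_per_code * 1000:.3f} mV per code")  # 1.276 mV per code
print(f"{full_scale_volts:.3f} V full scale")      # 1.305 V full scale
```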

(M) NTSC, (M, N) PAL 

Since (M) NTSC and (M, N) PAL have a
7.5 IRE blanking pedestal and a 40 IRE sync
amplitude, the color space conversion
equations are derived so as to generate 0.660V of
active video.

Figure 9.1. Typical NTSC/PAL Digital Encoder Implementation.

YUV Color Space Processing

Modern encoder designs are now based
on the YUV color space. For these encoders,
the YCbCr to YUV equations are:

Y = 0.591(Y601 - 64)

U = 0.504(Cb - 512)

V = 0.711(Cr - 512)

The R'G'B' to YUV equations are:

Y = 0.151R' + 0.297G' + 0.058B'

U = -0.074R' - 0.147G' + 0.221B'

V = 0.312R' - 0.261G' - 0.051B'

For pro-video applications using a 10-bit
nominal range of 64-940 for R'G'B', the R'G'B' to
YUV equations are:

Y = 0.177(R' - 64) + 0.347(G' - 64) + 0.067(B' - 64)

U = -0.087(R' - 64) - 0.171(G' - 64) + 0.258(B' - 64)

V = 0.364(R' - 64) - 0.305(G' - 64) - 0.059(B' - 64)

Y has a nominal range of 0 to 518, U a
nominal range of 0 to ±226, and V a nominal range
of 0 to ±319. Negative values of Y should be
supported to allow test signals, keying
information, and real-world video to be passed
through the encoder with minimum
corruption.
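
A minimal sketch of the (M) NTSC YCbCr to YUV conversion, using the three coefficients given above, makes the nominal ranges easy to verify. The 10-bit input limits used in the usage note (64-940 for Y and 64-960 for Cb and Cr) are assumed to be four times the 8-bit nominal ranges stated earlier; the function name is hypothetical.

```python
def ycbcr_to_yuv_ntsc(y601, cb, cr):
    """Convert 10-bit YCbCr to YUV for (M) NTSC with a 7.5 IRE pedestal."""
    y = 0.591 * (y601 - 64)
    u = 0.504 * (cb - 512)
    v = 0.711 * (cr - 512)
    # Results are signed: U and V swing either side of zero, and Y can go
    # negative for inputs below the nominal black level.
    return round(y), round(u), round(v)
```

Feeding in the upper limits (940, 960, 960) returns (518, 226, 319), and the lower chroma limits return -226 and -319, so U and V are symmetric about zero.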

YIQ Color Space Processing 

For older NTSC encoder designs based on
the YIQ color space, the YCbCr to YIQ
equations are:

Y = 0.591(Y601 - 64)

I = 0.596(Cr - 512) - 0.274(Cb - 512)

Q = 0.387(Cr - 512) + 0.423(Cb - 512)

The R'G'B' to YIQ equations are:

Y = 0.151R' + 0.297G' + 0.058B'

I = 0.302R' - 0.139G' - 0.163B'

Q = 0.107R' - 0.265G' + 0.158B'

For pro-video applications using a 10-bit
nominal range of 64-940 for R'G'B', the R'G'B' to
YIQ equations are:

Y = 0.177(R' - 64) + 0.347(G' - 64) + 0.067(B' - 64)

I = 0.352(R' - 64) - 0.162(G' - 64) - 0.190(B' - 64)

Q = 0.125(R' - 64) - 0.309(G' - 64) + 0.184(B' - 64)

Y has a nominal range of 0 to 518, I a
nominal range of 0 to ±309, and Q a nominal range
of 0 to ±271. Negative values of Y should be
supported to allow test signals, keying
information, and real-world video to be passed
through the encoder with minimum
corruption.
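
The YCbCr to YIQ coefficients are not independent of the YUV ones: the I and Q axes are the U and V axes rotated by 33 degrees. The sketch below applies that rotation to the (M) NTSC U and V gains (0.504 and 0.711) and reproduces the I and Q coefficients to three decimal places. The rotation relationship is standard NTSC practice rather than something stated in this section, and the variable names are illustrative.

```python
import math

ROTATION = math.radians(33.0)  # I/Q axes are rotated 33 degrees from U/V

U_GAIN = 0.504  # U = 0.504 (Cb - 512)
V_GAIN = 0.711  # V = 0.711 (Cr - 512)

# I = V cos(33) - U sin(33)
# Q = V sin(33) + U cos(33)
i_from_cr = V_GAIN * math.cos(ROTATION)
i_from_cb = -U_GAIN * math.sin(ROTATION)
q_from_cr = V_GAIN * math.sin(ROTATION)
q_from_cb = U_GAIN * math.cos(ROTATION)

for name, value in [("I from Cr", i_from_cr), ("I from Cb", i_from_cb),
                    ("Q from Cr", q_from_cr), ("Q from Cb", q_from_cb)]:
    print(f"{name}: {value:+.3f}")
```

The same rotation applied to the NTSC-J gains yields the NTSC-J YIQ coefficients, since only the overall scaling differs.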

YCbCr Color Space Processing 

If the design is based on the YUV color
space, the Cb and Cr conversion to U and V
may be avoided by scaling the sin and cos
values during the modulation process or scaling
the color difference lowpass filter coefficients.
This has the advantage of reducing data path
processing.

NTSC-J

Since the version of (M) NTSC used in
Japan has a 0 IRE blanking pedestal, the color
space conversion equations are derived so as
to generate 0.714V of active video.

YUV Color Space Processing

The YCbCr to YUV equations are:

Y = 0.639(Y601 - 64)

U = 0.545(Cb - 512)

V = 0.769(Cr - 512)

The R'G'B' to YUV equations are:

Y = 0.164R' + 0.321G' + 0.062B'

U = -0.080R' - 0.159G' + 0.239B'

V = 0.337R' - 0.282G' - 0.055B'

For pro-video applications using a 10-bit
nominal range of 64-940 for R'G'B', the R'G'B' to
YUV equations are:

Y = 0.191(R' - 64) + 0.375(G' - 64) + 0.073(B' - 64)

U = -0.094(R' - 64) - 0.185(G' - 64) + 0.279(B' - 64)

V = 0.393(R' - 64) - 0.329(G' - 64) - 0.064(B' - 64)

Y has a nominal range of 0 to 560, U a
nominal range of 0 to ±244, and V a nominal range
of 0 to ±344. Negative values of Y should be
supported to allow test signals, keying
information, and real-world video to be passed
through the encoder with minimum
corruption.

YIQ Color Space Processing 

For older encoder designs based on the
YIQ color space, the YCbCr to YIQ equations
are:

Y = 0.639(Y601 - 64)

I = 0.645(Cr - 512) - 0.297(Cb - 512)

Q = 0.419(Cr - 512) + 0.457(Cb - 512)

The R'G'B' to YIQ equations are:

Y = 0.164R' + 0.321G' + 0.062B'

I = 0.326R' - 0.150G' - 0.176B'

Q = 0.116R' - 0.286G' + 0.170B'

For pro-video applications using a 10-bit
nominal range of 64-940 for R'G'B', the R'G'B' to
YIQ equations are:

Y = 0.191(R' - 64) + 0.375(G' - 64) + 0.073(B' - 64)

I = 0.381(R' - 64) - 0.176(G' - 64) - 0.205(B' - 64)

Q = 0.135(R' - 64) - 0.334(G' - 64) + 0.199(B' - 64)

Y has a nominal range of 0 to 560, I a
nominal range of 0 to ±334, and Q a nominal range
of 0 to ±293. Negative values of Y should be
supported to allow test signals, keying
information, and real-world video to be passed
through the encoder with minimum
corruption.

YCbCr Color Space Processing

If the design is based on the YUV color
space, the Cb and Cr conversion to U and V
may be avoided by scaling the sin and cos
values during the modulation process or scaling
the color difference lowpass filter coefficients.
This has the advantage of reducing data path
processing.



(B, D, G, H, I, Nc) PAL

Since these PAL standards have a 0 IRE
blanking pedestal and a 43 IRE sync amplitude,
the color space conversion equations are
derived so as to generate 0.7V of active video.

YUV Color Space Processing

The YCbCr to YUV equations are:

Y = 0.625(Y601 - 64)

U = 0.533(Cb - 512)

V = 0.752(Cr - 512)

The R'G'B' to YUV equations are:

Y = 0.160R' + 0.314G' + 0.061B'

U = -0.079R' - 0.155G' + 0.234B'

V = 0.329R' - 0.275G' - 0.054B'

For pro-video applications using a 10-bit
nominal range of 64-940 for R'G'B', the R'G'B' to
YUV equations are:

Y = 0.187(R' - 64) + 0.367(G' - 64) + 0.071(B' - 64)

U = -0.092(R' - 64) - 0.181(G' - 64) + 0.273(B' - 64)

V = 0.385(R' - 64) - 0.322(G' - 64) - 0.063(B' - 64)

Y has a nominal range of 0 to 548, U a
nominal range of 0 to ±239, and V a nominal range
of 0 to ±337. Negative values of Y should be
supported to allow test signals, keying
information, and real-world video to be passed
through the encoder with minimum
corruption.

YCbCr Color Space Processing 

If the design is based on the YUV color
space, the Cb and Cr conversion to U and V
may be avoided by scaling the sin and cos
values during the modulation process or scaling
the color difference lowpass filter coefficients.
This has the advantage of reducing data path
processing.
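
The idea of folding the U and V gains into the sin and cos values can be illustrated numerically. The sketch assumes the usual quadrature form for the modulated chrominance, U sin(wt) + V cos(wt), and omits the line-by-line V inversion that PAL applies; the gains are the (B, D, G, H, I, Nc) PAL values 0.533 and 0.752. Function names are hypothetical.

```python
import math

U_GAIN, V_GAIN = 0.533, 0.752  # PAL gains from the YCbCr to YUV equations


def chroma_via_uv(cb, cr, phase):
    """Convert to U and V first, then modulate with unit-amplitude carriers."""
    u = U_GAIN * (cb - 512)
    v = V_GAIN * (cr - 512)
    return u * math.sin(phase) + v * math.cos(phase)


def chroma_via_scaled_carriers(cb, cr, phase):
    """Skip the U/V conversion by folding the gains into the sin and cos values."""
    scaled_sin = U_GAIN * math.sin(phase)  # would be precomputed in a lookup table
    scaled_cos = V_GAIN * math.cos(phase)
    return (cb - 512) * scaled_sin + (cr - 512) * scaled_cos
```

Both functions return the same value for any input, but the second removes two multipliers from the data path because the scaled sin and cos values can be stored in the subcarrier lookup table.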

Luminance (Y) Processing 

Lowpass filtering to about 6 MHz must be 
done to remove high-frequency components 
generated as a result of the 2x oversampling 
process. 

An optional notch filter may also be used to 
remove the color subcarrier frequency from 
the luminance information. This improves 
decoded video quality for decoders that use 
simple Y/C separation. The notch filter should 
be disabled when generating S-video, RGB, or 
YPbPr video signals. 

Next, any blanking pedestal is added dur- 
ing active video, and the blanking and sync 
information is added. 

(M) NTSC, (M, N) PAL 

As (M) NTSC and (M, N) PAL have a 7.5 
IRE blanking pedestal, a value of 42 is added to 
the luminance data during active video. 0 is 
added during the blank time. 
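
The pedestal value of 42 follows from the 10-bit level assignments (white = 800, sync = 16) and the 140 IRE span from sync tip to white. A short check, with illustrative variable names:

```python
CODES_SYNC_TO_WHITE = 800 - 16  # 784 codes span sync tip to white
IRE_SYNC_TO_WHITE = 40 + 100    # 40 IRE of sync plus 100 IRE of video

codes_per_ire = CODES_SYNC_TO_WHITE / IRE_SYNC_TO_WHITE  # 5.6 codes per IRE

pedestal = round(7.5 * codes_per_ire)         # 7.5 IRE pedestal -> 42
blank_level = 16 + round(40 * codes_per_ire)  # sync tip plus 40 IRE -> 240
black_level = blank_level + pedestal          # 282

print(pedestal, blank_level, black_level)
```

The resulting blank and black levels of 240 and 282 are the values the encoder uses for (M) NTSC.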

After the blanking pedestal is added, the
luminance data is clamped by a blanking signal
that has a raised cosine distribution to slow the
slew rate of the start and end of the video
signal. Typical blank rise and fall times are 140
±20 ns for NTSC and 300 ±100 ns for PAL.

Digital composite sync information is
added to the luminance data after the blank
processing has been performed. Values of 16
(sync present) or 240 (no sync) are assigned.
The sync rise and fall times should be
processed to generate a raised cosine distribution
(between 16 and 240) to slow the slew rate of
the sync signal. Typical sync rise and fall times
are 140 ±20 ns for NTSC and 250 ±50 ns for
PAL, although the encoder should generate
sync edges of about 130 or 240 ns to
compensate for the analog output filters slowing the
sync edges.

At this point, we have digital luminance 
with sync and blanking information, as shown 
in Table 9.2. 
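
A raised cosine sync or blanking edge can be generated by weighting the step between two levels with half a cosine cycle. The sketch below is one simple way to do this; the number of samples per edge is a design choice that depends on the sample clock and the target rise time, and the function name is hypothetical.

```python
import math


def raised_cosine_edge(start_level, end_level, num_samples):
    """Return `num_samples` values easing from start_level to end_level."""
    if num_samples < 2:
        raise ValueError("need at least two samples for an edge")
    edge = []
    for k in range(num_samples):
        # Weight rises smoothly from 0 to 1 with zero slope at both ends.
        weight = 0.5 * (1.0 - math.cos(math.pi * k / (num_samples - 1)))
        edge.append(round(start_level + (end_level - start_level) * weight))
    return edge
```

For example, `raised_cosine_edge(240, 16, 5)` produces a falling sync edge from the blanking level to the sync tip. At a 27 MHz oversampled clock (an assumption), a 140 ns transition spans fewer than four sample periods, so only a handful of intermediate values are needed.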

NTSC-J 

When generating NTSC-J video, there is a 
0 IRE blanking pedestal. Thus, no blanking 
pedestal is added to the luminance data during 
active video. Otherwise, the processing is the 
same as for (M) NTSC. 

(B, D, G, H, I, N c ) PAL 

When generating (B, D, G, H, I, Nc) PAL 
video, there is a 0 IRE blanking pedestal. Thus, 
no blanking pedestal is added to the luminance 
data during active video. 

Blanking information is inserted using the 
same technique as used for (M) NTSC. However,
typical blank rise and fall times are 300
±100 ns.

Composite sync information is added 
using the same technique as used for (M) 
NTSC, except values of 16 (sync present) or 
252 (no sync) are used. Typical sync rise and 
fall times are 250 ±50 ns, although the encoder
should generate sync edges of about 240 ns to
compensate for the analog output filters slowing
the sync edges.

At this point, we have digital luminance 
with sync and blanking information, as shown 
in Table 9.2. 

Analog Luminance (Y) Generation 

The digital luminance data may drive a 10-bit
DAC that generates a 0-1.305V output to
generate the Y video signal of an S-video (Y/C)
interface.

Figures 9.2 and 9.3 show the luminance 
video waveforms for 75% color bars. The num- 
bers on the luminance levels indicate the data 
value for a 10-bit DAC with a full-scale output 
value of 1.305V. The video signal at the connector
should have a source impedance of 75 Ω.

As the sample-and-hold action of the DAC 
introduces a (sin x)/x characteristic, the video 
data may be digitally filtered by a [(sin x)/x]⁻¹
filter to compensate. Alternately, as an analog 
lowpass filter is usually present after the DAC, 
the correction may take place in the analog fil- 
ter. 
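
The size of the (sin x)/x loss can be estimated with a short calculation. The sketch below assumes a 13.5 MHz sample clock and evaluates the loss at 4.2 MHz; both values are chosen for illustration only, and the function names are hypothetical.

```python
import math

def dac_droop_db(f_hz, fs_hz):
    """Gain in dB of the DAC sample-and-hold at f_hz, relative to DC."""
    x = math.pi * f_hz / fs_hz
    if x == 0.0:
        return 0.0
    return 20.0 * math.log10(math.sin(x) / x)

def inverse_sinc_gain(f_hz, fs_hz):
    """Linear gain a [(sin x)/x]^-1 filter would need at f_hz."""
    return 10.0 ** (-dac_droop_db(f_hz, fs_hz) / 20.0)

droop = dac_droop_db(4.2e6, 13.5e6)         # roughly -1.4 dB
boost = inverse_sinc_gain(4.2e6, 13.5e6)    # gain needed to cancel it
```

Under these assumptions the loss at the top of the video band is on the order of 1.4 dB, which is why the compensation may reasonably be folded into the analog lowpass filter.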

As an option, the ability to delay the digital
Y information a programmable number of
clock cycles before driving the DAC may be
useful. If the analog luminance video is lowpass
filtered after the DAC, and the analog
chrominance video is bandpass filtered after its
DAC, the chrominance video path may have a
longer delay (typically up to about 400 ns) than
the luminance video path. By adjusting the
delay of the Y data, the analog luminance and
chrominance video will be aligned more
closely after filtering, simplifying the analog
design.

Video Level   (M) NTSC   NTSC-J   (B, D, G, H, I, Nc) PAL   (M, N) PAL
white            800       800              800                 800
black            282       240              252                 282
blank            240       240              252                 240
sync              16        16               16                  16

Table 9.2. 10-Bit Digital Luminance Values.

Figure 9.2. (M) NTSC Luminance (Y) Video Signal for 75% Color Bars. Indicated luminance levels
are 10-bit values.

Figure 9.3. (B, D, G, H, I) PAL Luminance (Y) Video Signal for 75% Color Bars. Indicated luminance
levels are 10-bit values.

Color Difference Processing 

Lowpass Filtering 

The color difference signals (CbCr, UV, or 
IQ) should be lowpass filtered using a Gauss- 
ian filter. This filter type minimizes ringing and 
overshoot, avoiding the generation of visual 
artifacts on sharp edges. 

If the encoder is used in a video editing 
application, the filters should have a maximum 
ripple of ±0.1 dB in the passband. This minimizes
the accumulation of gain and loss artifacts
due to the filters, especially when multiple 
passes through the encoding and decoding 
processes are done. At the final encoding 
point, Gaussian filters may be used. 

YCbCr and YUV Color Space 

Cb and Cr, or U and V, are lowpass filtered
to about 1.3 MHz. Typical filter characteristics 
are <2 dB attenuation at 1.3 MHz and >20 dB 
attenuation at 3.6 MHz. The filter characteris- 
tics are shown in Figure 9.4. 

YIQ Color Space 

Q is lowpass filtered to about 0.6 MHz. 
Typical filter characteristics are <2 dB attenua- 
tion at 0.4 MHz, <6 dB attenuation at 0.5 MHz, 
and >6 dB attenuation at 0.6 MHz. The filter 
characteristics are shown in Figure 9.5. 

Typical filter characteristics for I are the 
same as for U and V. 



Filter Considerations 

The modulation process is shown in spec- 
tral terms in Figures 9.6 through 9.9. The fre- 
quency spectra of the modulation process are 
the same as those if the modulation process 
were analog, but are repeated at harmonics of 
the sample rate. 

Using wide-band (1.3 MHz) filters, the 
modulated chrominance spectra overlap near 
the zero frequency regions, resulting in alias- 
ing. Also, there may be considerable aliasing 
just above the subcarrier frequency. For these 
reasons, the use of narrower-band lowpass fil- 
ters (0.6 MHz) may be more appropriate. 

Wide-band Gaussian filters ensure opti- 
mum compatibility with monochrome displays 
by minimizing the artifacts at the edges of col- 
ored objects. A narrower, sharper-cut lowpass 
filter would emphasize the subcarrier signal at 
these edges, resulting in ringing. If mono- 
chrome compatibility can be ignored, a benefi- 
cial effect of narrower filters would be to 
reduce the spread of the chrominance into the 
low-frequency luminance (resulting in low-frequency
cross-luminance), which is difficult to
suppress in a decoder. 

Also, although the encoder may maintain a 
wide chrominance bandwidth, the bandwidth 
of the color difference signals in a decoder is 
usually much narrower. In the decoder, loss of 
the chrominance upper sidebands (due to low- 
pass filtering the video signal to 4.2-5.5 MHz) 
contributes to ringing and color difference 
crosstalk on color transitions. Any increase in 
the decoder chrominance bandwidth causes a 
proportionate increase in cross-color.

Figure 9.4. Typical 1.3 MHz Lowpass Digital Filter Characteristics.

Figure 9.5. Typical 0.6 MHz Lowpass Digital Filter Characteristics.

Figure 9.6. Frequency Spectra for NTSC Digital Chrominance Modulation (Fs = 13.5 MHz, Fsc =
3.58 MHz). (A) Lowpass filtered U and V signals. (B) Color subcarrier. (C) Modulated chrominance
spectrum produced by convolving (A) and (B).

Figure 9.7. Frequency Spectra for NTSC Digital Chrominance Modulation (Fs = 12.27 MHz, Fsc =
3.58 MHz). (A) Lowpass filtered U and V signals. (B) Color subcarrier. (C) Modulated chrominance
spectrum produced by convolving (A) and (B).

Figure 9.8. Frequency Spectra for PAL Digital Chrominance Modulation (Fs = 13.5 MHz, Fsc =
4.43 MHz). (A) Lowpass filtered U and V signals. (B) Color subcarrier. (C) Modulated chrominance
spectrum produced by convolving (A) and (B).

Figure 9.9. Frequency Spectra for PAL Digital Chrominance Modulation (Fs = 14.75 MHz, Fsc =
4.43 MHz). (A) Lowpass filtered U and V signals. (B) Color subcarrier. (C) Modulated chrominance
spectrum produced by convolving (A) and (B).

Chrominance (C) Modulation

(M) NTSC, NTSC-J 

During active video, the CbCr, UV, or IQ 
data modulate sin and cos subcarriers, as 
shown in Figure 9.1, resulting in digital 
chrominance (C) data. For this design, the 11-bit
reference subcarrier phase (see Figure
9.17) and the burst phase are the same (180°).

For YUV and YCbCr processing, 180° 
must be added to the 11-bit reference subcar- 
rier phase during active video time so the out- 
put of the sin and cos ROMs have the proper 
subcarrier phases (0° and 90°, respectively). 

For YIQ processing, 213° must be added to 
the 11-bit reference subcarrier phase during 
active video time so the output of the sin and 
cos ROMs have the proper subcarrier phases 
(33° and 123°, respectively). 

For the following equations, 

ω = 2πFsc

Fsc = 3.579545 MHz (±10 Hz)

YUV Color Space 

As discussed in Chapter 8, the chromi- 
nance signal may be represented by: 

(U sin ωt) + (V cos ωt)

Chrominance amplitudes are ±sqrt(U² + V²).

YCbCr Color Space

If the encoder is based on the YCbCr color 
space, the chrominance signal may be repre- 
sented by: 

(Cb - 512) (0.504) (sin ωt) +
(Cr - 512) (0.711) (cos ωt)

For NTSC-J systems, the equations are:

(Cb - 512) (0.545) (sin ωt) +
(Cr - 512) (0.769) (cos ωt)

In these cases, the values in the sin and 
cos ROMs are scaled by the indicated values to 
allow the modulator multipliers to accept Cb 
and Cr data directly, instead of U and V data. 
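
As an illustration of the YCbCr form of the modulation, the sketch below evaluates the chrominance for a given sample index. It uses floating point rather than scaled ROM values, assumes a 13.5 MHz sample clock, and the function names are hypothetical; the NTSC-J scale factors can be passed in place of the defaults.

```python
import math

FSC = 3579545.0   # (M) NTSC subcarrier frequency in Hz, from the text
FS = 13.5e6       # sample clock in Hz (assumed for this example)

def ntsc_chroma(cb, cr, n, k_cb=0.504, k_cr=0.711):
    """Modulated chrominance for sample n from 10-bit Cb and Cr data."""
    wt = 2.0 * math.pi * FSC * n / FS
    return (cb - 512) * k_cb * math.sin(wt) + (cr - 512) * k_cr * math.cos(wt)

def chroma_amplitude(cb, cr, k_cb=0.504, k_cr=0.711):
    """Peak chrominance amplitude, sqrt(U^2 + V^2), for the given Cb and Cr."""
    return math.hypot((cb - 512) * k_cb, (cr - 512) * k_cr)

# NTSC-J uses the larger scale factors given above.
ntsc_j_sample = ntsc_chroma(600, 450, 10, k_cb=0.545, k_cr=0.769)
```

Neutral input (Cb = Cr = 512) produces zero chrominance at every sample, as expected for a gray picture.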

YIQ Color Space 

As discussed in Chapter 8, the chromi- 
nance signal may also be represented by: 

(Q sin (ωt + 33°)) + (I cos (ωt + 33°))

Chrominance amplitudes are ±sqrt(I² + Q²).

(B, D, G, H, I, M, N, Nc) PAL

During active video, the CbCr or UV data 
modulate sin and cos subcarriers, as shown in 
Figure 9.1, resulting in digital chrominance 
(C) data. For this design, the 11-bit reference 
subcarrier phase (see Figure 9.17) is 135°. 

For the following equations, 

ω = 2πFsc

Fsc = 4.43361875 MHz (±5 Hz) for (B, D, G, H, I, N) PAL

Fsc = 3.58205625 MHz (±5 Hz) for (Nc) PAL

Fsc = 3.57561149 MHz (±5 Hz) for (M) PAL

PAL Switch 

In theory, since the [sin ωt] and [cos ωt]
subcarriers are orthogonal, the U and V sig- 
nals can be perfectly separated from each 
other in the decoder. However, if the video sig- 
nal is subjected to distortion, such as asymmet- 
rical attenuation of the sidebands due to 
lowpass filtering, the orthogonality is 
degraded, resulting in crosstalk between the U 
and V signals.

PAL uses alternate line switching of the V
signal to provide a frequency offset between 
the U and V subcarriers, in addition to the 90° 
subcarrier phase offset. When decoded, 
crosstalk components appear modulated onto 
the alternate line carrier frequency, in solid 
color areas producing a moving pattern known 
as Hanover bars. This pattern may be sup- 
pressed in the decoder by a comb filter that 
averages equal contributions from switched 
and unswitched lines. 

When PAL switch = 0, the 11-bit reference 
subcarrier phase (see Figure 9.17) and the 
burst phase are the same (135°). Thus, 225° 
must be added to the 11-bit reference subcar- 
rier phase during active video so the output of 
the sin and cos ROMs have the proper subcar- 
rier phases (0° and 90°, respectively). 

When PAL switch = 1, 90° is added to the 
11-bit reference subcarrier phase, resulting in 
a 225° burst phase. Thus, an additional 135° 
must be added to the 11-bit reference subcar- 
rier phase during active video so the output of 
the sin and cos ROMs have the proper phases 
(0° and 90°, respectively). 

Note that in Figure 9.17, while PAL switch 
= 1, the -V subcarrier is generated, implement- 
ing the -V component. 
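
The phase bookkeeping for the two PAL switch states can be sketched in 11-bit phase steps (2048 steps per 360°). The helper names below are hypothetical; the offsets are the ones given in the text.

```python
PHASE_STEPS = 2048  # 11-bit subcarrier phase: 2048 steps per 360 degrees

def deg_to_steps(deg):
    """Convert degrees to 11-bit phase steps."""
    return round(deg * PHASE_STEPS / 360.0) % PHASE_STEPS

def burst_phase(ref_phase, pal_switch):
    """Burst phase: the reference, advanced by 90 degrees when PAL switch = 1."""
    extra = deg_to_steps(90) if pal_switch else 0
    return (ref_phase + extra) % PHASE_STEPS

def active_video_phase(ref_phase, pal_switch):
    """Phase applied to the sin and cos ROMs during active video."""
    if pal_switch == 0:
        # The reference equals the 135 degree burst phase; add 225 degrees.
        return (ref_phase + deg_to_steps(225)) % PHASE_STEPS
    # The burst phase is 225 degrees; add a further 135 degrees.
    return (burst_phase(ref_phase, 1) + deg_to_steps(135)) % PHASE_STEPS

ref = deg_to_steps(135)   # 768 steps
```

In both switch states the ROMs see a phase of 0° during active video, while the burst phase alternates between 135° and 225°.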

YUV Color Space 

As discussed in Chapter 8, the chromi- 
nance signal is represented by: 

(U sin ωt) ± (V cos ωt)

with the sign of V alternating from one line to
the next (known as the PAL switch).

Chrominance amplitudes are ±sqrt(U² + V²).



YCbCr Color Space 

If the encoder is based on the YCbCr color 
space, the chrominance signal for (B, D, G, H, 
I, Nc) PAL may be represented by:

(Cb - 512) (0.533) (sin ωt) ±
(Cr - 512) (0.752) (cos ωt)

The chrominance signal for (M, N) PAL may
be represented by:

(Cb - 512) (0.504) (sin ωt) ±
(Cr - 512) (0.711) (cos ωt)

In these cases, the values in the sin and 
cos ROMs are scaled by the indicated values to 
allow the modulator multipliers to accept Cb 
and Cr data directly, instead of U and V data. 

General Processing 

The subcarrier sin and cos values should 
have a minimum of nine bits plus sign of accu- 
racy. The modulation multipliers must have 
saturation logic on the outputs to ensure over- 
flow and underflow conditions are saturated to 
the maximum and minimum values, respec- 
tively. 
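
A minimal sketch of the saturation logic, combined with rounding the summed chrominance to nine bits plus sign, is shown below. It assumes a two's complement range of -512 to +511 and round-half-up rounding; neither choice is stated in the text, and the function names are hypothetical.

```python
import math

CHROMA_MIN = -512   # assumed most negative value for nine bits plus sign
CHROMA_MAX = 511    # assumed most positive value for nine bits plus sign

def saturate(value, lo=CHROMA_MIN, hi=CHROMA_MAX):
    """Limit value to [lo, hi] instead of letting it wrap around."""
    return max(lo, min(hi, value))

def round_and_saturate(u_term, v_term):
    """Sum the two modulated terms, round to an integer, and saturate."""
    return saturate(int(math.floor(u_term + v_term + 0.5)))
```

Without saturation, an overflow would wrap to the opposite polarity and appear as a large chrominance error at that sample.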

After the modulated color difference sig- 
nals are added together, the result is rounded 
to nine bits plus sign. At this point, the digital 
modulated chrominance has the ranges shown 
in Table 9.3. The resulting digital chrominance 
data is clamped by a blanking signal that has 
the same raised cosine values and timing as 
the signal used to blank the luminance data.

Burst Generation

As shown in Figure 9.1, the lowpass fil- 
tered color difference data are multiplexed 
with the color burst envelope information. Dur- 
ing the color burst time, the color difference 
data should be ignored and the burst envelope 
signal inserted on the Cb, U, or Q channel (the 
Cr, V, or I channel is forced to zero).

The burst envelope rise and fall times 
should generate a raised cosine distribution to 
slow the slew rate of the burst envelope. Typical
burst envelope rise and fall times are 300
±100 ns.

The burst envelope should be wide enough 
to generate nine or ten cycles of burst informa- 
tion with an amplitude of 50% or greater. When 
the burst envelope signal is multiplied by the 
output of the sin ROM, the color burst is gener- 
ated and will have the range shown in Table 
9.3. 

For pro-video applications, the phase of the 
color burst should be programmable over a 0° 
to 360° range to provide optional system phase 
matching with external video signals. This can 
be done by adding a programmable value to 
the 11-bit subcarrier reference phase during 
the burst time (see Figure 9.17). 



Analog Chrominance (C) Generation 

The digital chrominance data may drive a 
10-bit DAC that generates a 0-1.305V output to 
generate the C video signal of an S-video (Y/C) 
interface. The video signal at the connector 
should have a source impedance of 75 Ω.

Figures 9.10 and 9.11 show the modulated 
chrominance video waveforms for 75% color 
bars. The numbers in parentheses indicate the 
data value for a 10-bit DAC with a full-scale out- 
put value of 1.305V. If the DAC can’t handle the 
generation of bipolar video signals, an offset 
must be added to the chrominance data (and 
the sign information dropped) before driving 
the DAC. In this instance, an offset of +512 was 
used, positioning the blanking level at the mid- 
point of the 10-bit DAC output level. 

As the sample-and-hold action of the DAC 
introduces a (sin x)/x characteristic, the video 
data may be digitally filtered by a [(sin x)/x]⁻¹
filter to compensate. Alternately, as an analog 
lowpass filter is usually present after the DAC, 
the correction may take place in the analog fil- 
ter. 



Video Level   (M) NTSC   NTSC-J   (B, D, G, H, I, Nc) PAL   (M, N) PAL
peak chroma      328       354              347                 328
peak burst       112       112              117                 117
blank              0         0                0                   0
peak burst      -112      -112             -117                -117
peak chroma     -328      -354             -347                -328

Table 9.3. 10-Bit Digital Chrominance Values.

Figure 9.10. (M) NTSC Chrominance (C) Video Signal for 75% Color Bars. Indicated video levels
are 10-bit values.

Figure 9.11. (B, D, G, H, I) PAL Chrominance (C) Video Signal for 75% Color Bars. Indicated video
levels are 10-bit values.

Figure 9.12. (M) NTSC Composite Video Signal for 75% Color Bars. Indicated video levels are
10-bit values.

Analog Composite Video

The digital luminance (Y) data and the digital
chrominance (C) data are added together,
generating digital composite color video with
the levels shown in Table 9.4.

The result may drive a 10-bit DAC that
generates a 0-1.305V output to generate the
composite video signal. The video signal at the
connector should have a source impedance of
75 Ω.



Figures 9.12 and 9.13 show the video wave- 
forms for 75% color bars. The numbers in 
parentheses indicate the data value for a 10-bit 
DAC with a full-scale output value of 1.305V. 
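
The addition of the luminance and chrominance data can be sketched in a few lines. The limiting to the 0-1023 range of a 10-bit DAC is an assumption added for safety, and the function name is hypothetical; the levels used are taken from Tables 9.2 and 9.3.

```python
def composite_sample(y, c, lo=0, hi=1023):
    """Add 10-bit luminance and signed chrominance, limited to the DAC range."""
    return max(lo, min(hi, y + c))

# (M) NTSC color burst riding on the blanking level:
# blank = 240 (Table 9.2), peak burst = +/-112 (Table 9.3).
ntsc_burst_high = composite_sample(240, 112)
ntsc_burst_low = composite_sample(240, -112)

# (B, D, G, H, I, Nc) PAL: blank = 252, peak burst = +/-117.
pal_burst_high = composite_sample(252, 117)
pal_burst_low = composite_sample(252, -117)
```

The resulting burst peaks (352 and 128 for NTSC, 369 and 135 for PAL) agree with the peak burst rows of Table 9.4.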

As the sample-and-hold action of the DAC
introduces a (sin x)/x characteristic, the video
data may be digitally filtered by a [(sin x)/x]⁻¹
filter to compensate. Alternately, as an analog
lowpass filter is usually present after the DAC,
the correction may take place in the analog filter.

Figure 9.13. (B, D, G, H, I) PAL Composite Video Signal for 75% Color Bars. Indicated video levels
are 10-bit values.

Video Level   (M) NTSC   NTSC-J   (B, D, G, H, I, Nc) PAL   (M, N) PAL
peak chroma      973       987              983                 973
white            800       800              800                 800
peak burst       352       352              369                 357
black            282       240              252                 282
blank            240       240              252                 240
peak burst       128       128              135                 123
peak chroma      109        53               69                 109
sync              16        16               16                  16

Table 9.4. 10-Bit Digital Composite Video Levels.

Black Burst Video Signal

As an option, the encoder can generate a 
black burst (or house sync) video signal that 
can be used to synchronize multiple video 
sources. Figures 9.14 and 9.15 illustrate the
black burst video signals. Note that these are
the same as analog composite, but do not con- 
tain any active video information. The numbers 
in parentheses indicate the data value for a 10-bit
DAC with a full-scale output value of 1.305V.

Figure 9.14. (M) NTSC Black Burst Video Signal. Indicated video levels are 10-bit values.

Figure 9.15. (B, D, G, H, I) PAL Black Burst Video Signal. Indicated video levels are 10-bit values.

Color Subcarrier Generation

The color subcarrier can be generated 
from the sample clock using a discrete time 
oscillator (DTO). 

When generating video that may be used 
for editing, it is important to maintain the 
phase relationship between the color subcar- 
rier and sync information. Unless the subcar- 
rier phase relative to the sync phase is 
properly maintained, an edit may result in a 
momentary color shift. PAL also requires the 
addition of a PAL switch, which is used to 
invert the polarity of the V data every other 
scan line. Note that the polarity of the PAL 
switch should be maintained through the 
encoding and decoding process. 

Since in this design the color subcarrier is 
derived from the sample clock, any jitter in the 
sample clock will result in a corresponding 
subcarrier frequency jitter. In some PCs, the 
sample clock is generated using a phase-lock 
loop (PLL), which may not have the necessary
clock stability to keep the subcarrier phase jit- 
ter below 2°-3°. 

Frequency Relationships 

(M) NTSC, NTSC-J 

As shown in Chapter 8, there is a defined
relationship between the subcarrier frequency
(Fsc) and the line frequency (FH):

Fsc/FH = 910/4

Assuming (for example only) a 13.5 MHz
sample clock rate (Fs):

Fs = 858 FH

Combining these equations produces the
relationship between Fsc and Fs:

Fsc/Fs = 35/132

which may also be expressed in terms of the
sample clock period (Ts) and the subcarrier
period (Tsc):

Ts/Tsc = 35/132

The color subcarrier phase must be 
advanced by this fraction of a subcarrier cycle 
each sample clock. 

(B, D, G, H, I, N) PAL 

As shown in Chapter 8, there is a defined
relationship between the subcarrier frequency
(Fsc) and the line frequency (FH):

Fsc/FH = (1135/4) + (1/625)

Assuming (for example only) a 13.5 MHz
sample clock rate (Fs):

Fs = 864 FH

Combining these equations produces the
relationship between Fsc and Fs:

Fsc/Fs = 709379/2160000

which may also be expressed in terms of the
sample clock period (Ts) and the subcarrier
period (Tsc):

Ts/Tsc = 709379/2160000

The color subcarrier phase must be 
advanced by this fraction of a subcarrier cycle 
each sample clock. 

(Nc) PAL 

In the (Nc) PAL video standard used in
Argentina, there is a different relationship
between the subcarrier frequency (Fsc) and
the line frequency (FH):

Fsc/FH = (917/4) + (1/625)

Assuming (for example only) a 13.5 MHz
sample clock rate (Fs):

Fs = 864 FH

Combining these equations produces the
relationship between Fsc and Fs:

Fsc/Fs = 573129/2160000

which may also be expressed in terms of the
sample clock period (Ts) and the subcarrier
period (Tsc):

Ts/Tsc = 573129/2160000

The color subcarrier phase must be 
advanced by this fraction of a subcarrier cycle 
each sample clock. 
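
The three ratios above can be checked with exact rational arithmetic. The sketch below uses Python's fractions module; the helper name is hypothetical, and the 13.5 MHz sample clock is the example value used in the text.

```python
from fractions import Fraction

def fsc_over_fs(cycles_per_line, samples_per_line):
    """Exact ratio Fsc/Fs from subcarrier cycles and samples per scan line."""
    return Fraction(cycles_per_line) / samples_per_line

ntsc = fsc_over_fs(Fraction(910, 4), 858)
pal = fsc_over_fs(Fraction(1135, 4) + Fraction(1, 625), 864)
pal_nc = fsc_over_fs(Fraction(917, 4) + Fraction(1, 625), 864)

# Subcarrier frequency implied by a 13.5 MHz sample clock.
pal_fsc_hz = float(pal * 13500000)
```

For PAL the implied subcarrier is 4433618.75 Hz, matching the nominal value quoted earlier in the chapter.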

Quadrature Subcarrier Generation 

A DTO consists of an accumulator in
which a smaller number [p] is added modulo
another number [q]. The counter consists of
an adder and a register as shown in Figure
9.16. The contents of the register are constrained
so that if they exceed or equal [q], [q]
is subtracted from the contents. The output
signal X(n) of the adder is:

X(n) = (X(n-1) + p) modulo q

With each clock cycle, [p] is added to pro- 
duce a linearly increasing series of digital val- 
ues. It is important that [q] not be an integer 
multiple of [p] so that the generated values are 
continuously different and the remainder 
changes from one cycle to the next. 
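
The accumulator behavior described above can be modeled directly. The class below is an illustrative sketch (its name and interface are not from the text); it is exercised with p = 35 and q = 132, the (M) NTSC ratio at 13.5 MHz.

```python
class DTO:
    """Discrete time oscillator: X(n) = (X(n-1) + p) modulo q."""

    def __init__(self, p, q):
        self.p = p
        self.q = q
        self.x = 0

    def clock(self):
        """Advance one sample clock; return True when the register wraps."""
        total = self.x + self.p
        wrapped = total >= self.q
        self.x = total - self.q if wrapped else total
        return wrapped

dto = DTO(35, 132)   # Fsc/Fs = 35/132
wraps = sum(dto.clock() for _ in range(132))
```

After 132 sample clocks the register has wrapped 35 times and returned to zero, that is, 35 complete subcarrier cycles in 132 samples.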



Figure 9.16. Single Stage DTO.

The DTO is used to reduce the sample
clock frequency, Fs, to the color subcarrier frequency,
Fsc:

Fsc = (p/q) Fs

Since [p] is of finite word length, the DTO output
frequency can be varied only in steps. With
a [p] word length of [w], the lowest [p] step is
0.5^w and the lowest DTO frequency step is:

Fsc = Fs/2^w

Note that the output frequency cannot be
greater than half the input frequency. This
means that the output frequency Fsc can only
be varied by the increment [p] and within the
range:

0 < Fsc < Fs/2

In this application, an overflow corresponds to
the completion of a full cycle of the subcarrier.

Since only the remainder (which repre- 
sents the subcarrier phase) is required, the 
number of whole cycles completed is of no 
interest. During each clock cycle, the output of 
the [q] register shows the relative phase of a 
subcarrier frequency in qths of a subcarrier 
period. By using the [q] register contents to 
address a ROM containing a sine wave characteristic,
a numerical representation of the sampled
subcarrier sine wave can be generated.

Single-Stage DTO

A single 24-bit or 32-bit modulo [q] regis- 
ter may be used, with the 11 most significant 
bits providing the subcarrier reference phase. 
An example of this architecture is shown in 
Figure 9.16. 
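
As a rough sketch of the single-stage approach, the following computes the phase increment for a 32-bit register and the resulting frequency error. It assumes the modulus [q] is 2^32 (so that overflow is automatic), a 13.5 MHz sample clock, and the (M) NTSC subcarrier; the variable and function names are illustrative.

```python
FS = 13.5e6        # sample clock in Hz (example value from the text)
FSC = 3579545.0    # desired (M) NTSC subcarrier in Hz
WIDTH = 32         # register width in bits (the text allows 24 or 32)

p = round(FSC / FS * 2**WIDTH)    # phase increment added each sample clock
step_hz = FS / 2**WIDTH           # smallest frequency step, Fs / 2^w
actual_fsc = p * FS / 2**WIDTH    # frequency actually generated
error_hz = actual_fsc - FSC

def reference_phase(register):
    """Return the 11 most significant bits of the phase register."""
    return (register & (2**WIDTH - 1)) >> (WIDTH - 11)
```

With a 32-bit register the frequency step is a few millihertz, so the error from rounding [p] is small but not zero, which is the motivation for the multi-stage arrangement.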

Multi-Stage DTO 

More long-term accuracy may be achieved 
if the ratio is partitioned into two or three frac- 
tions, the more significant of which provides 
the subcarrier reference phase, as shown in 
Figure 9.17. 

To use the full capacity of the ROM and 
make the overflow automatic, the denominator 
of the most significant fraction is made a power 
of two. The 4x HCOUNT denominator of the 
least significant fraction is used to simplify 
hardware calculations. 

Subdividing the subcarrier period into
2048 phase steps, and using the total number
of samples per scan line (HCOUNT), the ratio
may be partitioned as follows:

Fsc/Fs = (P1 + (P2 / ((4)(HCOUNT)))) / 2048

P1 and P2 are programmed to generate the
desired color subcarrier frequency (Fsc). The
modulo 4x HCOUNT and modulo 2048
counters should be reset at the beginning of
each vertical sync of field one to ensure the
generation of the correct subcarrier reference
(as shown in Figures 8.5 and 8.16).



The less significant stage produces a
sequence of carry bits which correct the
approximate ratio of the upper stage by altering
the counting step by one: from P1 to P1 + 1.
The upper stage produces an 11-bit subcarrier
phase used to address the sine and cosine
ROMs.

Although the upper stage adder automati- 
cally overflows to provide modulo 2048 opera- 
tion, the lower stage requires additional 
circuitry because 4x HCOUNT may not be 
(and usually isn’t) an integer power of two. In 
this case, the 16-bit register has a maximum 
capacity of 65535 and the adder generates a 
carry for any value greater than this. To pro- 
duce the correct carry sequence, it is neces- 
sary, each time the adder overflows, to adjust 
the next number added to make up the differ- 
ence between 65535 and 4x HCOUNT. This 
requires: 

P3 = 65536 - (4) (HCOUNT) + P2 

Although this changes the contents of the 
lower stage register, the sequence of carry bits 
is unchanged, ensuring that the correct phase 
values are generated. 
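
The claim that the carry sequence is unchanged can be checked with a small simulation. The sketch below compares a true modulo 4x HCOUNT accumulator with a 16-bit adder that adds P3 on the clock following each overflow. Two details are assumptions rather than statements from the text: a reset is treated like an overflow (so the first value added is P3), and the function names are hypothetical. The P1 and P2 values are those listed for 13.5 MHz (M) NTSC in Table 9.5.

```python
def ideal_carries(p2, modulus, n):
    """Carry sequence of a true modulo-`modulus` accumulator."""
    r, out = 0, []
    for _ in range(n):
        r += p2
        carry = r >= modulus
        if carry:
            r -= modulus
        out.append(carry)
    return out

def sixteen_bit_carries(p2, modulus, n):
    """Carries from a 16-bit adder that adds P3 after each overflow."""
    p3 = 65536 - modulus + p2
    reg, out = 0, []
    use_p3 = True   # assumption: a reset is treated like an overflow
    for _ in range(n):
        total = reg + (p3 if use_p3 else p2)
        carry = total >= 65536   # natural 16-bit overflow
        reg = total & 0xFFFF
        use_p3 = carry
        out.append(carry)
    return out

def upper_stage_phase(p1, carries):
    """11-bit phase after stepping by P1, or P1 + 1 when a carry occurs."""
    phase = 0
    for carry in carries:
        phase = (phase + p1 + (1 if carry else 0)) % 2048
    return phase

HCOUNT, P1, P2 = 858, 543, 104
carries = sixteen_bit_carries(P2, 4 * HCOUNT, 4 * HCOUNT)
```

Over 4x HCOUNT sample clocks (four scan lines) the lower stage produces exactly P2 carries and the upper stage returns to phase zero, corresponding to 910 whole subcarrier cycles.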

The P1 and P2 values are determined for
(M) NTSC operation using the following equation:

Fsc/Fs = (P1 + (P2 / ((4)(HCOUNT)))) / 2048 = (910/4) (1/HCOUNT)

Figure 9.17. 3-Stage DTO Chrominance Subcarrier Generation.

The P1 and P2 values are determined for
(B, D, G, H, I, N) PAL operation using the following
equation:

Fsc/Fs = (P1 + (P2 / ((4)(HCOUNT)))) / 2048 = ((1135/4) + (1/625)) (1/HCOUNT)



The P1 and P2 values are determined for the
version of (Nc) PAL used in Argentina using
the following equation:

Fsc/Fs = (P1 + (P2 / ((4)(HCOUNT)))) / 2048 = ((917/4) + (1/625)) (1/HCOUNT)



The modulo 625 counter, with a [p] value
of 67, is used during 625-line operation to more
accurately adjust subcarrier generation due to
the 0.1072 remainder after calculating the P1
and P2 values. During 525-line operation, the
carry signal should always be forced to be
zero. Table 9.5 lists some of the common horizontal
resolutions, sample clock rates, and
their corresponding HCOUNT, P1, and P2 values.
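
The P1 and P2 values, and the remainder handled by the modulo 625 counter, can be derived with exact arithmetic. The function below is a sketch with a hypothetical name; it takes the number of subcarrier cycles per scan line and HCOUNT.

```python
from fractions import Fraction

def dto_values(cycles_per_line, hcount):
    """Return (P1, P2, remainder) for Fsc/Fs = cycles_per_line / hcount."""
    x = Fraction(cycles_per_line) * 2048 / hcount   # exact upper-stage step
    p1 = x.numerator // x.denominator
    y = (x - p1) * 4 * hcount                       # exact lower-stage step
    p2 = y.numerator // y.denominator
    return p1, p2, y - p2

NTSC = Fraction(910, 4)
PAL = Fraction(1135, 4) + Fraction(1, 625)

ntsc_13_5 = dto_values(NTSC, 858)
pal_13_5 = dto_values(PAL, 864)
```

For 13.5 MHz PAL this gives P1 = 672, P2 = 2061, and a remainder of exactly 67/625 (0.1072), which is where the [p] value of 67 for the modulo 625 counter comes from. For NTSC the remainder is zero.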

Sine and Cosine Generation 

Regardless of the type of DTO used, each 
value of the 11-bit subcarrier phase corre- 
sponds to one of 2048 waveform values taken 
at a particular point in the subcarrier cycle 
period and stored in ROM. The sample points 
are taken at odd multiples of one 4096th of the 
total period to avoid end-effects when the sam- 
ple values are read out in reverse order. 



Note that only one quadrant of the subcar- 
rier wave shape is stored in ROM, as shown in 
Figure 9.18. The values for the other quadrants 
are produced using the symmetrical properties 
of the sinusoidal waveform. The maximum 
phase error using this technique is ±0.09° (half
of 360/2048), which corresponds to a maximum
amplitude error of ±0.08%, relative to the
peak-to-peak amplitude, at the steepest part of 
the sine wave signal. 

Figure 9.17 also shows a technique for
generating quadrature subcarriers from an 11-bit
subcarrier phase signal. It uses two ROMs
to store quadrants of sine and cosine wave- 
forms. XOR gates invert the addresses for gen- 
erating time-reversed portions of the 
waveforms and to invert the output polarity to 
make negative portions of the waveforms. An 
additional gate is provided in the sign bit for 
the V subcarrier to allow injection of a PAL 
switch square wave to implement phase inver- 
sion of the V signal on alternate scan lines. 
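
The quadrant technique can be sketched in software. For brevity, the example below uses a single quarter-wave table and obtains the cosine by advancing the phase a quarter cycle, rather than the two ROMs of Figure 9.17; it also stores floating point values instead of nine bits plus sign. The names are illustrative.

```python
import math

# One quadrant: 512 samples taken at odd multiples of 1/4096 of a period.
QUARTER_ROM = [math.sin(2.0 * math.pi * (2 * i + 1) / 4096.0)
               for i in range(512)]

def sin_lookup(phase):
    """Sine value for an 11-bit phase using quadrant symmetry."""
    phase &= 0x7FF
    index = phase & 0x1FF
    if phase & 0x200:        # second and fourth quadrants: reverse the address
        index ^= 0x1FF
    value = QUARTER_ROM[index]
    return -value if phase & 0x400 else value   # second half cycle is negative

def cos_lookup(phase):
    """Cosine is the sine advanced by a quarter cycle (512 phase steps)."""
    return sin_lookup(phase + 512)
```

Because the samples sit at odd multiples of 1/4096 of the period, reading the table backward lands on exactly the right sample points, with no repeated or missing end value.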

Horizontal and Vertical Timing 

Vertical and horizontal counters are used 
to control the video timing. 

Timing Control 

To control the horizontal and vertical 
counters, separate horizontal sync (HSYNC#) 
and vertical sync (VSYNC#) signals are com- 
monly used. A BLANK# control signal is usu- 
ally used to indicate when to generate active 
video. 

If HSYNC#, VSYNC#, and BLANK# are 
inputs, controlling the horizontal and vertical 
counters, this is referred to as “slave” timing. 
HSYNC#, VSYNC#, and BLANK# are gener- 
ated by another device in the system, and used 
by the encoder to generate the video.

Figure 9.18. Positions of the 512 Stored Sample Values
in the sin and cos ROMs for One Quadrant of a
Subcarrier Cycle. Samples for other quadrants are
generated by inverting the addresses and/or sign
values. (Horizontal axis: odd 4096ths of a subcarrier
period, 1, 3, 5, ... 1021, 1023, spanning 0° to 90°.)



Typical Application              Total Samples per Scan Line (HCOUNT)   4x HCOUNT    P1     P2
13.5 MHz (M) NTSC                               858                        3432      543    104
13.5 MHz (B, D, G, H, I) PAL                    864                        3456      672   2061
12.27 MHz (M) NTSC                              780                        3120      597   1040
14.75 MHz (B, D, G, H, I) PAL                   944                        3776      615   2253

Table 9.5. Typical HCOUNT, P1, and P2 Values for the 3-Stage DTO in
Figure 9.17.

The horizontal and vertical counters may
also be used to generate the basic video tim- 
ing. In this case, referred to as “master” tim- 
ing, HSYNC#, VSYNC#, and BLANK# are 
outputs from the encoder, and used elsewhere 
in the system. 

For a BT.656 video interface, horizontal 
blanking (H), vertical blanking (V), and field 
(F) information are used. In this application, 
the encoder would use the H, V, and F timing 
bits directly, rather than depending on 
HSYNC#, VSYNC#, and BLANK# control sig- 
nals. 

Table 9.6 lists the typical horizontal blank 
timing for common sample clock rates. A 
blanking control signal (BLANK#) is used to 
specify when to generate active video. 

Horizontal Timing 

An 11-bit horizontal counter is incre- 
mented on each rising edge of the sample 
clock, and reset by HSYNC#. The counter 
value is monitored to determine when to assert 
and negate various control signals each scan 
line, such as the start of burst envelope, end of 
burst envelope, etc. 



During slave timing operation, if there is 
no HSYNC# pulse at the end of a line, the 
counter can either continue incrementing (recommended)
or automatically reset (not recommended).

Vertical Timing 

A 10-bit vertical counter is incremented on 
each leading edge of HSYNC#, and reset when 
coincident leading edges of VSYNC# and 
HSYNC# occur. Rather than exactly coincident 
falling edges, a coincident window of about ±64
clock cycles should be used to ease interfacing 
to some video timing controllers. If both the 
HSYNC# and VSYNC# leading edges are 
detected within 64 clock cycles of each other, it 
is assumed to be the beginning of Field 1. The 
counter value is monitored to determine which 
scan line is being generated. 
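
The coincidence window can be expressed as a simple comparison of the clock counts at which the two leading edges were seen. This is a sketch only: the function name is hypothetical, and treating a difference of exactly 64 clocks as coincident is an assumption.

```python
WINDOW = 64   # clock cycles, from the text

def starts_field_1(hsync_edge_clock, vsync_edge_clock, window=WINDOW):
    """True if the HSYNC# and VSYNC# leading edges fall within the window."""
    return abs(hsync_edge_clock - vsync_edge_clock) <= window
```

A VSYNC# edge arriving half a scan line away from HSYNC# falls well outside this window and would therefore be taken as the start of Field 2.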

For interlaced (M) NTSC, color burst 
information should be disabled on scan lines 
1-9 and 264-272, inclusive. On the remaining 
scan lines, color burst information should be 
enabled and disabled at the appropriate hori- 
zontal count values. 



Typical Application              Sync + Back Porch Blanking (Samples)   Front Porch Blanking (Samples)
13.5 MHz (M) NTSC                               122                                 16
13.5 MHz (B, D, G, H, I) PAL                    132                                 12
12.27 MHz (M) NTSC                              126                                 14
14.75 MHz (B, D, G, H, I) PAL                   163                                 13

Table 9.6. Typical BLANK# Horizontal Timing.

For noninterlaced (M) NTSC, color burst
information should be disabled on scan lines 
1-9, inclusive. A 29.97 Hz (30/1.001) offset 
may be added to the color subcarrier fre- 
quency so the subcarrier phase will be 
inverted from field to field. On the remaining 
scan lines, color burst information should be 
enabled and disabled at the appropriate hori- 
zontal count values. 

For interlaced (B, D, G, H, I, N, Nc) PAL,
during fields 1, 2, 5, and 6, color burst informa- 
tion should be disabled on scan lines 1-6, 310- 
318, and 623-625, inclusive. During fields 3, 4, 
7, and 8, color burst information should be dis- 
abled on scan lines 1-5, 311-319, and 622-625, 
inclusive. On the remaining scan lines, color 
burst information should be enabled and dis- 
abled at the appropriate horizontal count val- 
ues. 

For noninterlaced (B, D, G, H, I, N, Nc)
PAL, color burst information should be dis- 
abled on scan lines 1-6 and 310-312, inclusive. 
On the remaining scan lines, color burst infor- 
mation should be enabled and disabled at the 
appropriate horizontal count values. 

For interlaced (M) PAL, during fields 1, 2, 
5, and 6, color burst information should be dis- 
abled on scan lines 1-8, 260-270, and 523-525, 
inclusive. During fields 3, 4, 7, and 8, color 
burst information should be disabled on scan 
lines 1-7, 259-269, and 522-525, inclusive. On 
the remaining scan lines, color burst informa- 
tion should be enabled and disabled at the 
appropriate horizontal count values. 

For noninterlaced (M) PAL, color burst 
information should be disabled on scan lines 
1-8 and 260-262, inclusive. On the remaining 
scan lines, color burst information should be 
enabled and disabled at the appropriate hori- 
zontal count values. 
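
The scan-line rules above reduce to a lookup keyed on the video standard and, for interlaced PAL, on the field within the eight-field sequence. The following is a minimal Python sketch of that lookup. The function name burst_blanked and the standard labels "NTSC-M", "PAL-B", and "PAL-M" are illustrative assumptions; only the line ranges come from the text. The horizontal gating of the burst within a line is not modeled.

```python
def burst_blanked(standard, interlaced, line, field=1):
    """Return True if color burst should be disabled on this scan line.

    standard: "NTSC-M", "PAL-B" (standing in for B, D, G, H, I, N, Nc PAL)
              or "PAL-M". These labels are hypothetical.
    field:    field number 1-8; only used for interlaced PAL.
    """
    if standard == "NTSC-M":
        ranges = [(1, 9), (264, 272)] if interlaced else [(1, 9)]
    elif standard == "PAL-B":
        if not interlaced:
            ranges = [(1, 6), (310, 312)]
        elif field in (1, 2, 5, 6):
            ranges = [(1, 6), (310, 318), (623, 625)]
        else:  # fields 3, 4, 7, and 8
            ranges = [(1, 5), (311, 319), (622, 625)]
    elif standard == "PAL-M":
        if not interlaced:
            ranges = [(1, 8), (260, 262)]
        elif field in (1, 2, 5, 6):
            ranges = [(1, 8), (260, 270), (523, 525)]
        else:  # fields 3, 4, 7, and 8
            ranges = [(1, 7), (259, 269), (522, 525)]
    else:
        raise ValueError("unknown standard: %r" % (standard,))
    # All ranges are inclusive at both ends.
    return any(first <= line <= last for first, last in ranges)
```

A digital encoder would typically evaluate such a test once per scan line, using the line counter and field number it already maintains.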



Early PAL receivers produced colored twit- 
ter at the top of the picture due to the swinging 
burst. To fix this, Bruch blanking was imple- 
mented to ensure that the phase of the first 
burst is the same following each vertical sync 
pulse. Analog encoders used a meander gate to 
control the burst reinsertion time by shifting 
one line at the vertical field rate. A digital 
encoder simply keeps track of the scan line 
and field number. Modern receivers do not 
require Bruch blanking, but it is useful for 
determining which field is being processed. 

During slave timing operation, if there is 
no VSYNC# pulse at the end of a frame, the 
counter can either continue incrementing (rec- 
ommended) or automatically reset (not recom- 
mended). 

During master timing operation, for pro- 
video applications, it may be desirable to gen- 
erate 2.5 scan line VSYNC# pulses during 625- 
line operation. However, this may cause Field 1 
vs. Field 2 detection problems in some com- 
mercially available video chips. 

Field ID Signals 

Although the timing relationship between 
HSYNC# and VSYNC#, or the BT.656 F bit, is 
used to specify Field 1 or Field 2, additional 
signals may be used to specify which one of 
four or eight fields to generate, as shown in 
Table 9.7. 

FIELD_0 should change state coincident 
with the leading edge of VSYNC# during 
Fields 1, 3, 5, and 7. FIELD_1 should change 
state coincident with the leading edge of 
VSYNC# during Fields 1 and 5. 

For BT.656 video interface, FIELD_0 and 
FIELD_1 may be transmitted using ancillary 
data. 
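
The relationship between the field number and the field ID signals can be expressed compactly. The Python sketch below treats the field number minus one as a 3-bit count whose bits are FIELD_1, FIELD_0, and the Field 1/Field 2 indication (0 for Field 1 timing, 1 for Field 2 timing, matching the BT.656 F bit). The function name is an illustrative assumption; the mapping follows Table 9.7. NTSC uses only fields 1-4, so FIELD_1 stays at 0.

```python
def field_id_signals(field_number):
    """Map a field number (1-8) to the tuple (FIELD_1, FIELD_0, F).

    F is 0 for Field 1 timing and 1 for Field 2 timing.
    """
    if not 1 <= field_number <= 8:
        raise ValueError("field number must be 1-8")
    count = field_number - 1
    return (count >> 2) & 1, (count >> 1) & 1, count & 1


def fields_where_signal_changes(index):
    """List the fields at whose start signal `index` (0=FIELD_1, 1=FIELD_0) toggles."""
    changes = []
    for field in range(1, 9):
        previous = (field - 2) % 8 + 1   # field 1 follows field 8
        if field_id_signals(field)[index] != field_id_signals(previous)[index]:
            changes.append(field)
    return changes
```

With this mapping, FIELD_0 changes state at the start of fields 1, 3, 5, and 7, and FIELD_1 changes state at the start of fields 1 and 5, as described above.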

Clean Encoding 

Typically, the only filters present in a con- 
ventional encoder are the color difference low- 
pass filters. This results in considerable 
spectral overlap between the luminance and 
chrominance components, making it impossi- 
ble to separate the signals completely at the 
decoder. 

However, additional processing at the 
encoder can be used to reduce cross-color 
(luminance-to-chrominance crosstalk) and 
cross-luminance (chrominance-to-luminance 
crosstalk) decoder artifacts. Cross-color 
appears as a coarse rainbow pattern or random 
colors in regions of fine detail. Cross-lumi- 
nance appears as a fine pattern on chromi- 
nance edges. 

Cross-color in a decoder may be reduced 
by removing some of the high-frequency lumi- 
nance data in the encoder, using a notch filter 
at F_SC. However, while this reduces cross- 
color, luminance detail is lost. 

A better method is to pre-comb filter the 
luminance and chrominance information in the 
encoder (see Figure 9.19). High-frequency 
luminance information is pre-combed to mini- 
mize interference with chrominance frequen- 
cies in that spectrum. Chrominance 
information also is pre-combed by averaging 
over a number of lines, reducing cross-lumi- 
nance or the hanging dot pattern. 

This technique allows fine, moving lumi- 
nance (which tends to generate cross-color at 
the decoder) to be removed while retaining full 
resolution for static luminance. However, there 
is a small loss of diagonal luminance resolution 
due to its being averaged over multiple lines. 
This is offset by an improvement in the 
chrominance signal-to-noise ratio (SNR). 



FIELD_1   FIELD_0   HSYNC# and VSYNC#       NTSC Field       PAL Field
Signal    Signal    Timing Relationship     Number           Number
                    or BT.656 F Bit

0         0         field 1                 1  odd field     1  even field
0         0         field 2                 2  even field    2  odd field
0         1         field 1                 3  odd field     3  even field
0         1         field 2                 4  even field    4  odd field
1         0         field 1                 -                5  even field
1         0         field 2                 -                6  odd field
1         1         field 1                 -                7  even field
1         1         field 2                 -                8  odd field

Table 9.7. Field Numbering. 

Figure 9.19. Clean Encoding Example. 



Bandwidth-Limited Edge Generation 

Smooth sync and blank edges may be gen- 
erated by integrating a T, or raised cosine, 
pulse to generate a T step (Figure 9.20). NTSC 
systems use a T pulse with T = 125 ns; there- 
fore, the 2T step has little signal energy 
beyond 4 MHz. PAL systems use a T pulse 
with T = 100 ns; in this instance, the 2T step 
has little signal energy beyond 5 MHz. 



The T step provides a fast risetime, without 
ringing, within a well-defined bandwidth. The 
risetime of the edge between the 10% and 90% 
points is 0.964T. By choosing appropriate sam- 
ple values for the sync edges, blanking edges, 
and burst envelope, these values can be stored 
in a small ROM, which is triggered at the 
appropriate horizontal count. By reading the 
contents of the ROM forward and backward, 
both rising and falling edges may be gener- 
ated. 
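
As a rough illustration of the ROM approach, the Python sketch below integrates a raised cosine pulse with a half-amplitude duration of T to obtain the T step, samples it into a table, and checks the 10% to 90% risetime numerically. The function names and the 25 ns sample period are assumptions chosen for the example; a real encoder would use its own sample clock and would scale the table to the sync, blanking, or burst envelope levels.

```python
import math


def t_step(x):
    """Normalized T step at time x, with x in units of T (0 <= x <= 2).

    This is the running integral of the raised cosine pulse
    0.5 * (1 - cos(pi * x)); it rises from 0.0 at x = 0 to 1.0 at x = 2.
    """
    x = min(max(x, 0.0), 2.0)
    return 0.5 * (x - math.sin(math.pi * x) / math.pi)


def build_edge_rom(t_ns, sample_period_ns):
    """Sample the T step once per clock across its 2T duration."""
    count = int(round(2.0 * t_ns / sample_period_ns))
    return [t_step(i * sample_period_ns / t_ns) for i in range(count + 1)]


def crossing(level):
    """Time (in units of T) at which the step reaches `level`, by bisection."""
    low, high = 0.0, 2.0
    for _ in range(60):
        mid = 0.5 * (low + high)
        if t_step(mid) < level:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


# NTSC example: T = 125 ns, hypothetical 25 ns sample period.
rising = build_edge_rom(t_ns=125.0, sample_period_ns=25.0)
falling = rising[::-1]                            # same ROM read backward
risetime_in_t = crossing(0.9) - crossing(0.1)     # close to 0.964
```

Reading the table in reverse gives the falling edge, so only one set of values needs to be stored.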








Figure 9.20. Bandwidth-Limited Edge Generation. (A) NTSC T pulse. (B) The 
T step, the result of integrating the T pulse. 

Level Limiting 

Certain highly saturated colors produce 
composite video levels that may cause prob- 
lems in downstream equipment. 

Invalid video levels greater than 100 IRE or 
less than -20 IRE (relative to the blank level) 
may be transmitted, but may cause distortion 
in VCRs or demodulators and cause sync sepa- 
ration problems. 

Illegal video levels greater than 120 IRE 
(NTSC) or 133 IRE (PAL), or below the sync 
tip level, may not be transmitted. 

Although usually not a problem in a con- 
ventional video application, computer systems 
commonly use highly saturated colors, which 
may generate invalid or illegal video levels. It 
may be desirable to optionally limit these sig- 
nal levels to around 110 IRE, compromising 
between limiting the available colors and gen- 
erating legal video levels. 

One method of correction is to adjust the 
luminance or saturation of invalid and illegal 
pixels until the desired peak limits are 
attained. Alternatively, the frame buffer contents 
may be scanned and any pixels that would 
generate an invalid or illegal video level flagged 
(using a separate overlay plane or color change). 
The user then may change the color to a more 
suitable one. 
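
The first correction method, reducing saturation until the peak limits are met, might be sketched as follows in Python. The function name, the use of IRE units for both arguments, and the default limits (110 IRE upper, -20 IRE lower) are assumptions for illustration; chroma_ire is the peak chrominance amplitude, so the composite signal swings between y_ire - chroma_ire and y_ire + chroma_ire.

```python
def saturation_scale(y_ire, chroma_ire, max_ire=110.0, min_ire=-20.0):
    """Return a factor (0.0-1.0) by which to scale the chroma amplitude.

    The factor is chosen so that the composite excursion
    y_ire +/- (factor * chroma_ire) stays within [min_ire, max_ire].
    Luminance is left unchanged; only saturation is reduced.
    """
    if chroma_ire <= 0.0:
        return 1.0
    scale = 1.0
    if y_ire + chroma_ire > max_ire:
        scale = min(scale, max(0.0, (max_ire - y_ire) / chroma_ire))
    if y_ire - chroma_ire < min_ire:
        scale = min(scale, max(0.0, (y_ire - min_ire) / chroma_ire))
    return scale
```

For example, a pixel with 90 IRE luminance and 45 IRE of chroma would peak at 135 IRE; the sketch returns 20/45, which brings the peak down to 110 IRE. The same test, without the scaling, could be used to flag offending pixels for the user.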

In a professional editing application, the 
option of transmitting all the video information 
(including invalid and illegal levels) between 
equipment is required to minimize editing and 
processing artifacts. 

Encoder Video Parameters 

Many industry-standard video parameters 
have been defined to specify the relative qual- 
ity of NTSC and PAL encoders. To measure 
these parameters, the output of the encoder 
(while generating various video test signals 
such as those described in Chapter 8) is moni- 
tored using video test equipment. Along with a 
description of several of these parameters, typ- 
ical AC parameter values for both consumer 
and studio-quality encoders are shown in Table 
9.8. 

Several AC parameters, such as group 
delay and K factors, are dependent on the qual- 
ity of the output filters and are not discussed 
here. In addition to the AC parameters dis- 
cussed in this section, there are several others 
that should be included in an encoder specifi- 
cation, such as burst frequency and tolerance, 
horizontal frequency, horizontal blanking time, 
sync rise and fall times, burst envelope rise 
and fall times, video blanking rise and fall 
times, and the bandwidths of the YIQ or YUV 
components. 

There are also several DC parameters 
(such as white level and tolerance, blanking 
level and tolerance, sync height and tolerance, 
peak-to-peak burst amplitude and tolerance) 
that should be specified, as shown in Table 9.9. 

Differential Phase 

Differential phase distortion, commonly 
referred to as differential phase, specifies how 
much the chrominance phase is affected by 
the luminance level — in other words, how 
much hue shift occurs when the luminance 
level changes. Both positive and negative 
phase errors may be present, so differential 
phase is expressed as a peak-to-peak measure- 
ment, expressed in degrees of subcarrier 
phase. 

This parameter is measured using chroma 
of uniform phase and amplitude superimposed 
on different luminance levels, such as the mod- 
ulated ramp test signal or the modulated 5-step 
portion of the composite test signal. The differ- 
ential phase parameter for a studio-quality 
encoder may approach 0.2° or less. 







                             Consumer Quality     Studio Quality
Parameter                    NTSC      PAL        NTSC      PAL        Units

differential phase           4         4          <1        <1         degrees
differential gain            4         4          <1        <1         %
luminance nonlinearity       2         2          <1        <1         %
hue accuracy                 3         3          <1        <1         degrees
color saturation accuracy    3         3          <1        <1         %
residual subcarrier          0.5       0.5        0.1       0.1        IRE
SNR (per EIA-250-C)          48        48         >60       >60        dB
SCH phase                    0 ±40     0 ±20      0 ±2      0 ±2       degrees
analog Y/C output skew       5         5          <2        <2         ns
H tilt                       <1        <1         <1        <1         %
V tilt                       <1        <1         <1        <1         %
subcarrier tolerance         10        5          10        5          Hz

Table 9.8. Typical AC Video Parameters for (M) NTSC and (B, D, G, H, I) PAL Encoders. 



                           Consumer Quality        Studio Quality
Parameter                  NTSC        PAL         NTSC        PAL        Units

white relative to blank    714 ±70     700 ±70     714 ±7      700 ±7     mV
black relative to blank    54 ±5       0           54 ±0.5     0          mV
sync relative to blank     -286 ±30    -300 ±30    -286 ±3     -300 ±3    mV
burst amplitude            286 ±30     300 ±30     286 ±3      300 ±3     mV

Table 9.9. Typical DC Video Parameters for (M) NTSC and (B, D, G, H, I) PAL Encoders. 

Differential Gain 

Differential gain distortion, commonly 
referred to as differential gain, specifies how 
much the chrominance gain is affected by the 
luminance level — in other words, how much 
color saturation shift occurs when the lumi- 
nance level changes. Both attenuation and 
amplification may occur, so differential gain is 
expressed as the largest amplitude change 
between any two levels, expressed as a per- 
centage of the largest chrominance amplitude. 

This parameter is measured using chroma 
of uniform phase and amplitude superimposed 
on different luminance levels, such as the mod- 
ulated ramp test signal or the modulated 5-step 
portion of the composite test signal. The differ- 
ential gain parameter for a studio-quality 
encoder may approach 0.2% or less. 
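
Once the phase and amplitude of each chrominance packet on the modulated ramp or 5-step signal have been measured, both differential phase and differential gain reduce to simple arithmetic. The Python sketch below follows the definitions given in the text; the function names and the sample measurements are hypothetical.

```python
def differential_phase(packet_phases_deg):
    """Peak-to-peak spread of the packet phases, in degrees of subcarrier phase."""
    return max(packet_phases_deg) - min(packet_phases_deg)


def differential_gain(packet_amplitudes):
    """Largest amplitude change between any two levels, as a percentage
    of the largest chrominance amplitude."""
    largest = max(packet_amplitudes)
    return 100.0 * (largest - min(packet_amplitudes)) / largest


# Hypothetical measurements for the five packets of a modulated 5-step signal.
phases = [0.1, -0.2, 0.3, 0.0, -0.1]          # degrees relative to nominal
amplitudes = [40.0, 39.8, 39.6, 39.9, 40.0]   # IRE peak-to-peak
dp = differential_phase(phases)               # about 0.5 degrees
dg = differential_gain(amplitudes)            # about 1 percent
```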

Luminance Nonlinearity 

Luminance nonlinearity, also referred to as 
differential luminance and luminance nonlin- 
ear distortion, specifies how much the lumi- 
nance gain is affected by the luminance level — 
in other words, a nonlinear relationship 
between the generated and ideal luminance 
levels. 

Using an unmodulated 5-step or 10-step 
staircase test signal, the difference between 
the largest and smallest steps, expressed as a 
percentage of the largest step, is used to spec- 
ify the luminance nonlinearity. Although this 
parameter is included within the differential 
gain and phase parameters, it is traditionally 
specified independently. 
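
The staircase calculation can be sketched in a few lines of Python. The function name and the sample levels are hypothetical; the input is the measured luminance level of each tread of the staircase, from bottom to top.

```python
def luminance_nonlinearity(tread_levels):
    """Difference between the largest and smallest step heights,
    as a percentage of the largest step."""
    steps = [upper - lower
             for lower, upper in zip(tread_levels, tread_levels[1:])]
    return 100.0 * (max(steps) - min(steps)) / max(steps)


# Hypothetical 5-step staircase (IRE): ideal steps would all be 20 IRE.
measured = [0.0, 20.0, 40.0, 59.0, 80.0, 100.0]
nonlinearity = luminance_nonlinearity(measured)   # steps of 19 and 21 IRE
```

Here the smallest step is 19 IRE and the largest is 21 IRE, so the result is 2/21, or roughly 9.5%.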

Chrominance Nonlinear Phase Distortion 

Chrominance nonlinear phase distortion 
specifies how much the chrominance phase 
(hue) is affected by the chrominance ampli- 
tude (saturation) — in other words, how much 
hue shift occurs when the saturation changes. 



Using a modulated pedestal test signal, or 
the modulated pedestal portion of the combi- 
nation test signal, the phase differences 
between each chrominance packet and the 
burst are measured. The difference between 
the largest and the smallest measurements is 
the peak-to-peak value, expressed in degrees 
of subcarrier phase. This parameter is usually 
not independently specified, but is included 
within the differential gain and phase parame- 
ters. 

Chrominance Nonlinear Gain Distortion 

Chrominance nonlinear gain distortion 
specifies how much the chrominance gain is 
affected by the chrominance amplitude (satu- 
ration) — in other words, a nonlinear relation- 
ship between the generated and ideal 
chrominance amplitude levels, usually seen as 
an attenuation of highly saturated chromi- 
nance signals. 

Using a modulated pedestal test signal, or 
the modulated pedestal portion of the combi- 
nation test signal, the test equipment is 
adjusted so that the middle chrominance 
packet is 40 IRE. The largest difference 
between the measured and nominal values of 
the amplitudes of the other two chrominance 
packets specifies the chrominance nonlinear 
gain distortion, expressed in IRE or as a per- 
centage of the nominal amplitude of the worst- 
case packet. This parameter is usually not 
independently specified, but is included within 
the differential gain and phase parameters. 

Chrominance-to-Luminance 

Intermodulation 

Chrominance-to-luminance intermodula- 
tion, commonly referred to as cross-modula- 
tion, specifies how much the luminance level is 
affected by the chrominance. This may be the 
result of clipping highly saturated chromi- 
nance levels or quadrature distortion and may 
show up as irregular brightness variations due 
to changes in color saturation. 

Using a modulated pedestal test signal, or 
the modulated pedestal portion of the combi- 
nation test signal, the largest difference 
between the ideal 50 IRE pedestal level and the 
measured luminance levels (after removal of 
chrominance information) specifies the 
chrominance-to-luminance intermodulation, 
expressed in IRE or as a percentage. This 
parameter is usually not independently speci- 
fied, but is included within the differential gain 
and phase parameters. 

Hue Accuracy 

Hue accuracy specifies how close the 
generated hue is to the ideal hue value. Both 
positive and negative phase errors may be 
present, so hue accuracy is the difference 
between the worst-case positive and worst-case 
negative measurements from nominal, 
expressed in degrees of subcarrier phase. This 
parameter is measured using EIA or EBU 75% 
color bars as a test signal. 

Color Saturation Accuracy 

Color saturation accuracy specifies how 
close the generated saturation is to the ideal 
saturation value, using EIA or EBU 75% color 
bars as a test signal. Both gain and attenuation 
may be present, so color saturation accuracy is 
the difference between the worst-case gain and 
worst-case attenuation measurements from 
nominal, expressed as a percentage of nomi- 
nal. 

Residual Subcarrier 

The residual subcarrier parameter speci- 
fies how much subcarrier information is 
present during white or gray (note that, ideally, 
none should be present). Excessive residual 
subcarrier is visible as noise during white or 
gray portions of the picture. 

Using an unmodulated 5-step or 10-step 
staircase test signal, the maximum peak-to- 
peak measurement of the subcarrier 
(expressed in IRE) during active video is used 
to specify the residual subcarrier relative to 
the burst amplitude. 

SCH Phase 

SCH (subcarrier to horizontal) phase 
refers to the phase relationship between the 
leading edge of horizontal sync (at the 50% 
amplitude point) and the zero crossings of the 
color burst (by extrapolating the color burst to 
the leading edge of sync). The error is 
referred to as SCH phase and is expressed in 
degrees of subcarrier phase. 

For PAL, the definition of SCH phase is 
slightly different due to the more complicated 
relationship between the sync and subcarrier 
frequencies — the SCH phase relationship for a 
given line repeats only once every eight fields. 
Therefore, PAL SCH phase is defined, per 
EBU Technical Statement D 23-1984 (E), as 
“the phase of the +U component of the color 
burst extrapolated to the half-amplitude point 
of the leading edge of the synchronizing pulse 
of line 1 of field 1.” 

SCH phase is important when merging two 
or more video signals. To avoid color shifts or 
picture jumps, the video signals must have the 
same horizontal, vertical, and subcarrier tim- 
ing and the phases must be closely matched. 
To achieve these timing constraints, the video 
signals must have the same SCH phase rela- 
tionship since the horizontal sync and subcar- 
rier are continuous signals with a defined 
relationship. It is common for an encoder to 
allow adjustment of the SCH phase to simplify 
merging two or more video signals. Maintain- 
ing proper SCH phase is also important since 
NTSC and PAL decoders may monitor the 
SCH phase to determine which color field is 
being decoded. 

Analog Y/C Video Output Skew 

The output skew between the analog lumi- 
nance (Y) and chrominance (C) video signals 
should be minimized to avoid phase shift 
errors between the luminance and chromi- 
nance information. Excessive output skew is 
visible as artifacts along sharp vertical edges 
when viewed on a monitor. 

H Tilt 

H tilt, also known as line tilt and line time 
distortion, causes a tilt in line-rate signals, pre- 
dominantly white bars. This type of distortion 
causes variations in brightness between the 
left and right edges of an image. For a digital 
encoder, such as that described in this chapter, 
H tilt is primarily an artifact of the analog out- 
put filters and the transmission medium. 

H tilt is measured using a line bar (such as 
the one in the NTC-7 NTSC composite test sig- 
nal) and measuring the peak-to-peak deviation 
of the tilt (in IRE or percent of white bar ampli- 
tude), ignoring the first and last microsecond 
of the white bar. 

V Tilt 

V tilt, also known as field tilt and field time 
distortion, causes a tilt in field-rate signals, pre- 
dominantly white bars. This type of distortion 
causes variations in brightness between the 
top and bottom edges of an image. For a digital 
encoder, such as that described in this chapter, 
V tilt is primarily an artifact of the analog out- 
put filters and the transmission medium. 

V tilt is measured using an 18 µs, 100 IRE 
white bar in the center of 130 lines in the cen- 
ter of the field or using a field square wave. 
The peak-to-peak deviation of the tilt is mea- 
sured (in IRE or percent of white bar ampli- 
tude), ignoring the first three and last three 
lines. 

Genlocking Support 

In many instances, it is desirable to be able 
to genlock the output (align the timing signals) 
of an encoder to another composite analog 
video signal to facilitate downstream video pro- 
cessing. This requires locking the horizontal, 
vertical, and color subcarrier frequencies and 
phases together, as discussed in the NTSC/ 
PAL decoder section of this chapter. In addi- 
tion, the luminance and chrominance ampli- 
tudes must be matched. A major problem in 
genlocking is that the regenerated sample 
clock may have excessive jitter, resulting in 
color artifacts. 

One genlocking variation is to send an 
advance house sync (also known as black 
burst or advance sync) to the encoder. The 
advancement compensates for the delay from 
the house sync generator to the encoder out- 
put being used in the downstream processor, 
such as a mixer. Each video source has its own 
advanced house sync signal, so each video 
source is time-aligned at the mixing or pro- 
cessing point. 

Another genlocking option allows adjust- 
ment of the subcarrier phase so it can be 
matched with other video sources at the mix- 
ing or processing point. The subcarrier phase 
must be able to be adjusted from 0° to 360°. 
Either zero SCH phase is always maintained or 
another adjustment is allowed to indepen- 
dently position the sync and luminance infor- 
mation in about 10 ns steps. 

The output delay variation between prod- 
ucts should be within about ±0.8 ns to allow 
video signals from different genlocked devices 
to be mixed properly. Mixers usually assume 
the two video signals are perfectly genlocked, 
and excessive time skew between the two 
video signals results in poor mixing perfor- 
mance. 

Alpha Channel Support 

An encoder designed for pro-video editing 
applications may support an alpha channel. 
Eight or ten bits of digital alpha data are input, 
pipelined to match the pipeline of the encoding 
process, and converted to an analog alpha sig- 
nal (discussed in Chapter 7). Alpha is usually 
linear, with the data generating an analog alpha 
signal (also called a key) with a range of 0-100 
IRE. There is no blanking pedestal or sync 
information present. 

In computer systems that support 32-bit 
pixels, 8 bits are typically available for alpha 
information. 

NTSC and PAL Digital 
Decoding 

Although the luminance and chrominance 
components in a NTSC/PAL encoder are usu- 
ally combined by simply adding the two signals 
together, separating them in a decoder is much 
more difficult. Analog NTSC and PAL decod- 
ers have been around for some time. However, 
they have been difficult to use, required adjust- 
ment, and offered limited video quality. 

Using digital techniques to implement 
NTSC and PAL decoding offers many advan- 
tages, such as ease of use, minimum analog 
adjustments, and excellent video quality. The 
use of digital circuitry also enables the design 
of much more robust and sophisticated Y/C 
separator and genlock implementations. 



A general block diagram of a NTSC/PAL 
digital decoder is shown in Figure 9.21. 

Digitizing the Analog Video 

The first step in digital decoding of com- 
posite video signals is to digitize the entire 
composite video signal using an A/D con- 
verter (ADC). For our example, 10-bit ADCs 
are used; therefore, indicated values are 10-bit 
values. 

The composite and S-video signals are 
illustrated in Figures 9.2, 9.3, 9.10, 9.11, 9.12, 
and 9.13. 

Video inputs are usually AC-coupled and 
have a 75 Ω AC and DC input impedance. As a 
result, the video signal must be DC restored 
every scan line during horizontal sync to posi- 
tion the sync tips at a known voltage level. 

The video signal must also be lowpass fil- 
tered (typically to about 6 MHz) to remove any 
high-frequency components that may result in 
aliasing. Although the video bandwidth for 
broadcast is rigidly defined, there is no stan- 
dard for consumer equipment. The video 
source generates as much bandwidth as it can; 
the receiving equipment accepts as much 
bandwidth as it can process. 

Video signals with amplitudes of 0.25x to 
2x ideal are common in the consumer market. 
The active video and/or sync signal may 
change amplitude, especially in editing situa- 
tions where the video signal may be composed 
of several different video sources merged 
together. 

In addition, the decoder should be able to 
handle 100% colors. Although only 75% colors 
may be broadcast, there is no such limitation 
for baseband video. With the frequent use of 
computer-generated text and graphics, highly 
saturated colors are becoming more common. 
Figure 9.21. Typical NTSC/PAL Digital Decoder Implementation. 

DC Restoration 

To remove any DC offset that may be 
present in the video signal, and position it at a 
known level, DC restoration (also called clamp- 
ing) is done. 

For composite or luminance (Y) video sig- 
nals, the analog video signal is DC restored to 
the REF- voltage of the ADC during each hori- 
zontal sync time. Thus, the ADC generates a 
code of 0 during the sync level. 

For chrominance (C) video signals, the 
analog video signal is DC restored to the mid- 
point of the ADC during the horizontal sync 
time. Thus, the ADC generates a code of 512 
during the blanking level. 

Automatic Gain Control 

An automatic gain control (AGC) is used to 
ensure that a constant value for the blanking 
level is generated by the ADC. If the blanking 
level is low or high, the video signal is ampli- 
fied or attenuated until the blanking level is 
correct. 

In S-video applications, the same amount of 
gain that is applied to the luminance video sig- 
nal should also be applied to the chrominance 
video signal. 

After DC restoration and AGC processing, 
an offset of 16 is added to the digitized com- 
posite and luminance signals to match the lev- 
els used by the encoder. 

Tables 9.2, 9.3, and 9.4 show the ideal ADC 
values for composite and S-video sources after 
DC restoration and automatic gain control have 
been done. 

Blanking Level Determination 

The most common method of determining 
the blanking level is to digitally lowpass filter 
the video signal to about 0.5 MHz to remove 
subcarrier information and noise. The back 
porch is then sampled multiple times to deter- 
mine an average blanking level value. 



To limit line-to-line variations and clamp 
streaking (the result of quantizing errors), the 
result should be averaged over 3-32 consecu- 
tive scan lines. Alternatively, the back porch 
level may be determined during the vertical 
blanking interval and the result used for the 
entire field. 
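
A minimal Python sketch of this two-stage averaging is shown below. The class name, the default of 8 lines, and the sample values are assumptions for illustration; the input to update() is taken to be back porch samples that have already been lowpass filtered to remove subcarrier and noise.

```python
from collections import deque


class BlankLevelEstimator:
    """Average the back porch within each line, then across recent lines."""

    def __init__(self, lines_to_average=8):   # the text suggests 3-32 lines
        self.history = deque(maxlen=lines_to_average)

    def update(self, back_porch_samples):
        """Add one line's back porch samples; return the running estimate."""
        line_average = sum(back_porch_samples) / len(back_porch_samples)
        self.history.append(line_average)     # oldest line drops out when full
        return sum(self.history) / len(self.history)
```

A longer history reduces line-to-line variation and clamp streaking at the cost of a slower response to genuine level changes.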

Video Gain Options 

The difference from the ideal blanking 
level is processed and used in one of several 
ways to generate the correct blanking level: 

(a) controlling a voltage-controlled amplifier 

(b) adjusting the REF+ voltage of the ADC 

(c) multiplying the outputs of the ADC 

In (a) and (b), an analog signal for control- 
ling the gain may be generated by either a 
DAC or a charge pump. If a DAC is used, it 
should have twice the resolution of the ADC to 
avoid quantizing noise. For this reason, a 
charge pump implementation may be more 
suitable. 

Option (b) is dependent on the ADC being 
able to operate over a wide range of reference 
voltages, and is therefore rarely implemented. 

Option (c) is rarely used due to the result- 
ing quantization errors from processing in the 
digital domain. 

Sync Amplitude AGC 

This is the most common mode of AGC, 
and is used where the characteristics of the 
video signal are not known. The difference 
between the measured and the ideal blanking 
level is used to determine how much to 
increase or decrease the gain of the entire 
video signal. 

Burst Amplitude AGC 

Another method of AGC is based on the 
color burst amplitude. This is commonly used 
in pro-video applications when the sync ampli- 
tude may not be related to the active video 
amplitude. 

First, the blanking level is adjusted to the 
ideal value, regardless of the sync tip position. 
This may be done by adding or subtracting a 
DC offset to the video signal. 

Next, the burst amplitude is determined. 
To limit line-to-line variations, the burst ampli- 
tude may be averaged over 3-32 consecutive 
scan lines. 

The difference between the measured and 
the ideal burst amplitude is used to determine 
how much to increase or decrease the gain of 
the entire video signal. During the gain adjust- 
ment, the blanking value should not change. 
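
One way to keep the blanking value fixed during the gain adjustment is to apply the gain about the blanking level rather than about zero. The Python sketch below illustrates this in the digital domain purely for clarity; the function name, the 10-bit clipping range, and the numbers in the example are assumptions, and an actual design would more likely control an analog amplifier as described under Video Gain Options.

```python
def burst_agc(samples, blank_level, measured_burst, ideal_burst, max_code=1023):
    """Scale samples about blank_level so the burst reaches its ideal amplitude.

    Returns (gain, adjusted_samples). A sample equal to blank_level is
    unchanged regardless of the gain.
    """
    gain = ideal_burst / measured_burst
    adjusted = []
    for sample in samples:
        value = blank_level + gain * (sample - blank_level)
        adjusted.append(min(max_code, max(0, int(round(value)))))
    return gain, adjusted


# Hypothetical case: the burst is half its ideal amplitude.
gain, out = burst_agc([240, 290, 190, 1000], blank_level=240,
                      measured_burst=50.0, ideal_burst=100.0)
```

In this example the gain is 2.0, the blanking sample stays at 240, and the last sample is clipped to the maximum code.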

AGC Options 

For some pro-video applications, such as if 
the video signal levels are known to be correct, 
if all the video levels except the sync height are 
correct, or if there is excessive noise in the 
video signal, it may be desirable to disable the 
automatic gain control. 

The AGC value to use may be specified by 
the user, or the AGC value may be frozen once 
it has been determined. 

Y/C Separation 

When decoding composite video, the lumi- 
nance (Y) and chrominance (C) must be sepa- 
rated. The many techniques for doing this are 
discussed in detail later in the chapter. 

After Y/C separation, Y has the nominal 
values shown in Table 9.2. Note that the lumi- 
nance still contains sync and blanking informa- 
tion. Modulated chrominance has the nominal 
values shown in Table 9.3. 



The quality of Y/C separation is a major 
factor in the overall video quality generated by 
the decoder. 

Color Difference Processing 

Chrominance (C) Demodulation 

The chrominance demodulator (Figure 
9.22) accepts modulated chroma data from 
either the Y/C separator or the chroma ADC. 
It generates CbCr, UV, or IQ color difference 
data. 

(M) NTSC, NTSC-J 

During active video, the chrominance data 
is demodulated using sin and cos subcarrier 
data, as shown in Figure 9.22, resulting in 
CbCr, UV, or IQ data. For this design, the 11- 
bit reference subcarrier phase (see Figure 
9.32) and the burst phase are the same (180°). 

For YUV or YCbCr processing, 180° must 
be added to the 11-bit reference subcarrier 
phase during active video time so the output of 
the sin and cos ROMs have the proper subcar- 
rier phases (0° and 90°, respectively). 

For YIQ processing, 213° must be added to 
the 11-bit reference subcarrier phase during 
active video time so the output of the sin and 
cos ROMs have the proper subcarrier phases 
(33° and 123°, respectively). 

For all the equations, 

$\omega = 2\pi F_{SC}$

$F_{SC} = 3.579545$ MHz 

YUV Color Space Processing 

As shown in Chapter 8, the chrominance 
signal processed by the demodulator may be 
represented by: 

$$(U \sin \omega t) + (V \cos \omega t)$$

U is obtained by multiplying the chrominance 
data by $[2 \sin \omega t]$, and V is obtained by 
multiplying by $[2 \cos \omega t]$: 

$$((U \sin \omega t) + (V \cos \omega t))(2 \sin \omega t) = U - (U \cos 2\omega t) + (V \sin 2\omega t)$$

$$((U \sin \omega t) + (V \cos \omega t))(2 \cos \omega t) = V + (V \cos 2\omega t) + (U \sin 2\omega t)$$

The $2\omega t$ components are removed by lowpass 
filtering, resulting in the U and V signals 
being recovered. The demodulator multipliers 
should ensure overflow and underflow condi- 
tions are saturated to the maximum and mini- 
mum values, respectively. The UV signals are 
then rounded to nine bits plus sign and low- 
pass filtered. 

Figure 9.22. Chrominance Demodulation Example That Generates CbCr Directly. 

For (M) NTSC, U has a nominal range of 0 
to ±226, and V has a nominal range of 0 to 
±319. 

For NTSC-J used in Japan, U has a nomi- 
nal range of 0 to ±244, and V has a nominal 
range of 0 to ±344. 
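
The demodulation arithmetic can be checked numerically. The Python sketch below builds a chrominance signal from known U and V values, multiplies it by the doubled sin and cos subcarriers, and averages the products. Two simplifying assumptions are made and are not part of the design described here: the sample clock is taken to be four times the subcarrier frequency, and a plain average over whole subcarrier cycles stands in for the lowpass filter. Saturation and rounding to nine bits plus sign are omitted.

```python
import math

SAMPLES_PER_CYCLE = 4   # assumption: sample clock = 4 x subcarrier frequency


def subcarrier_phase(index):
    """Subcarrier phase (omega * t, in radians) at sample `index`."""
    return 2.0 * math.pi * index / SAMPLES_PER_CYCLE


def modulate(u, v, count):
    """Chrominance samples: (U sin wt) + (V cos wt)."""
    return [u * math.sin(subcarrier_phase(i)) + v * math.cos(subcarrier_phase(i))
            for i in range(count)]


def demodulate(chroma):
    """Multiply by 2 sin wt and 2 cos wt, then average to remove the
    double-frequency terms (a crude stand-in for the lowpass filters).
    len(chroma) should be a multiple of SAMPLES_PER_CYCLE."""
    count = len(chroma)
    u = sum(c * 2.0 * math.sin(subcarrier_phase(i))
            for i, c in enumerate(chroma)) / count
    v = sum(c * 2.0 * math.cos(subcarrier_phase(i))
            for i, c in enumerate(chroma)) / count
    return u, v


u_out, v_out = demodulate(modulate(226.0, -319.0, 16))
```

The recovered values match the inputs to within floating-point error, because the double-frequency terms average to zero over an integer number of subcarrier cycles.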

YIQ Color Space Processing 

As shown in Chapter 8, for older decoders, 
the chrominance signal processed by the 
demodulator may be represented by: 

$$(Q \sin(\omega t + 33^\circ)) + (I \cos(\omega t + 33^\circ))$$

The subcarrier generator of the decoder pro- 
vides a 33° phase offset during active video, 
canceling the 33° phase terms in the equation. 

Q is obtained by multiplying the chromi- 
nance data by $[2 \sin \omega t]$, and I is obtained by 
multiplying by $[2 \cos \omega t]$: 

$$((Q \sin \omega t) + (I \cos \omega t))(2 \sin \omega t) = Q - (Q \cos 2\omega t) + (I \sin 2\omega t)$$

$$((Q \sin \omega t) + (I \cos \omega t))(2 \cos \omega t) = I + (I \cos 2\omega t) + (Q \sin 2\omega t)$$

The $2\omega t$ components are removed by lowpass 
filtering, resulting in the I and Q signals 
being recovered. The demodulator multipliers 
should ensure overflow and underflow condi- 
tions are saturated to the maximum and mini- 
mum values, respectively. The IQ signals are 
then rounded to nine bits plus sign and low- 
pass filtered. 

For (M) NTSC, I has a nominal range of 0 
to ±309, and Q has a nominal range of 0 to 
±271. 

For NTSC-J used in Japan, I has a nominal 
range of 0 to ±334, and Q has a nominal range 
of 0 to ±293. 

YCbCr Color Space Processing 

If the decoder is based on the YCbCr color
space, the chrominance signal may be
represented by:

(Cb - 512) (0.504) (sin ωt) +
(Cr - 512) (0.711) (cos ωt)

For NTSC-J systems, the equations are:

(Cb - 512) (0.545) (sin ωt) +
(Cr - 512) (0.769) (cos ωt)

In these cases, the values in the sin and 
cos ROMs are scaled by the reciprocal of the 
indicated values to allow the demodulator to 
generate Cb and Cr data directly, instead of U 
and V data. 
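
The reciprocal scaling can be sketched as follows. This is an
illustrative Python model, not a hardware description: the ROMs are
plain lists, the block average again stands in for the lowpass filter,
and the names build_roms and demodulate_cbcr are hypothetical. The
0.504 and 0.711 factors are the (M) NTSC values given above.

```python
import math

N = 132                          # samples per repeat of the phase pattern
STEP = 2 * math.pi * 35 / N      # 35 subcarrier cycles in 132 samples


def build_roms(k_cb=0.504, k_cr=0.711):
    """sin/cos tables pre-scaled by the reciprocal of the Cb/Cr factors."""
    sin_rom = [2 * math.sin(STEP * i) / k_cb for i in range(N)]
    cos_rom = [2 * math.cos(STEP * i) / k_cr for i in range(N)]
    return sin_rom, cos_rom


def demodulate_cbcr(chroma, sin_rom, cos_rom):
    """Demodulate directly to Cb and Cr (averaging acts as the lowpass)."""
    n = len(chroma)
    cb = sum(c * s for c, s in zip(chroma, sin_rom)) / n + 512
    cr = sum(c * s for c, s in zip(chroma, cos_rom)) / n + 512
    return cb, cr


cb_in, cr_in = 600, 450          # hypothetical constant color
chroma = [(cb_in - 512) * 0.504 * math.sin(STEP * i)
          + (cr_in - 512) * 0.711 * math.cos(STEP * i) for i in range(N)]
cb_out, cr_out = demodulate_cbcr(chroma, *build_roms())
```

Because the scale factors are folded into the tables, no separate UV
to CbCr multiplier is needed in the data path.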

(B, D, G, H, I, M, N, Nc) PAL

During active video, the digital chrominance
(C) data is demodulated using sin and
cos subcarrier data, as shown in Figure 9.22,
resulting in CbCr or UV data. For this design,
the 11-bit reference subcarrier phase (see
Figure 9.32) and the burst phase are the same
(135°).

For all the equations,

ω = 2πFSC

FSC = 4.43361875 MHz for (B, D, G, H, I, N) PAL

FSC = 3.58205625 MHz for (Nc) PAL

FSC = 3.57561149 MHz for (M) PAL

Using a switched subcarrier waveform in
the Cr or V channel also removes the PAL
switch modulation. Thus, [+2 cos ωt] is used
while the PAL switch is a logical zero (burst
phase = +135°) and [-2 cos ωt] is used while
the PAL switch is a logical one (burst phase =
225°).

YUV Color Space 

As shown in Chapter 8, the chrominance
signal is represented by:

(U sin ωt) ± (V cos ωt)

U is obtained by multiplying the chrominance
data by [2 sin ωt] and V is obtained by
multiplying by [±2 cos ωt]:

((U sin ωt) ± (V cos ωt)) (2 sin ωt)

= U - (U cos 2ωt) ± (V sin 2ωt)

((U sin ωt) ± (V cos ωt)) (±2 cos ωt)

= V ± (U sin 2ωt) + (V cos 2ωt)

The 2ωt components are removed by lowpass
filtering, resulting in the U and V signals
being recovered. The demodulation multipliers
should ensure overflow and underflow
conditions are saturated to the maximum and
minimum values, respectively. The UV signals
are then rounded to nine bits plus sign and
lowpass filtered.

For (B, D, G, H, I, Nc) PAL, U has a nominal
range of 0 to ±239, and V has a nominal
range of 0 to ±337.

For (M, N) PAL, U has a nominal range of
0 to ±226, and V has a nominal range of 0 to
±319.

YCbCr Color Space 

If the decoder is based on the YCbCr color
space, the chrominance signal for (B, D, G, H,
I, Nc) PAL may be represented by:

(Cb - 512) (0.533) (sin ωt) ±
(Cr - 512) (0.752) (cos ωt)

The chrominance signal for (M, N) PAL may
be represented by:

(Cb - 512) (0.504) (sin ωt) ±
(Cr - 512) (0.711) (cos ωt)

In these cases, the values in the sin and 
cos ROMs are scaled by the reciprocal of the 
indicated values to allow the demodulator to 
generate Cb and Cr data directly, instead of U 
and V data. 



Hanover Bars 

If the locally generated subcarrier phase is
incorrect, a line-to-line pattern known as
Hanover bars results, in which pairs of adjacent
lines have a real and complementary hue
error. As shown in Figure 9.23 with an ideal
color of green, two adjacent lines of the display
have a hue error (towards yellow), the next
two have the complementary hue error
(towards cyan), and so on.

This can be shown by introducing a phase
error (θ) in the locally generated subcarrier.
After lowpass filtering to remove the 2ωt
components, the demodulator outputs are:

((U sin ωt) ± (V cos ωt)) (2 sin (ωt - θ))

= (U cos θ) ∓ (V sin θ)

((U sin ωt) ± (V cos ωt)) (±2 cos (ωt - θ))

= (V cos θ) ± (U sin θ)

In areas of constant color, averaging equal
contributions from even and odd lines (either
visually or using a delay line) cancels the
alternating crosstalk component, leaving only
a desaturation of the true component by
[cos θ].
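
The cancellation can be checked numerically. The sketch below is a
simplified Python model under stated assumptions: the color is
constant, the block average stands in for the lowpass filter, and the
window of 64 samples holding 21 subcarrier cycles is a convenient
hypothetical ratio rather than an exact PAL relationship. The function
name demodulate_line and the test values are also illustrative.

```python
import math

N = 64                           # averaging window, in samples
STEP = 2 * math.pi * 21 / N      # 21 whole subcarrier cycles per window


def demodulate_line(u, v, pal_switch, theta):
    """Demodulate one line using a subcarrier with phase error theta."""
    sign = -1.0 if pal_switch else 1.0   # V alternates sign line to line
    u_out = v_out = 0.0
    for k in range(N):
        wt = STEP * k
        chroma = u * math.sin(wt) + sign * v * math.cos(wt)
        u_out += chroma * 2 * math.sin(wt - theta)
        v_out += chroma * sign * 2 * math.cos(wt - theta)
    return u_out / N, v_out / N          # averaging acts as the lowpass


U, V, THETA = 80.0, -120.0, math.radians(10)
line_a = demodulate_line(U, V, False, THETA)   # U cos θ - V sin θ, ...
line_b = demodulate_line(U, V, True, THETA)    # U cos θ + V sin θ, ...
u_avg = (line_a[0] + line_b[0]) / 2            # crosstalk terms cancel
v_avg = (line_a[1] + line_b[1]) / 2
```

The two lines differ in hue (the Hanover bar pair), while their
average is simply the true color scaled by cos θ.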

Lowpass Filtering 

The decoder requires sharper roll-off fil- 
ters than the encoder to ensure adequate sup- 
pression of the sampling alias components. 
Note that with a 13.5 MHz sampling frequency, 
they start to become significant above 3 MHz. 

The demodulation process for (M) NTSC
is shown spectrally in Figures 9.24 and 9.25;
the process is similar for PAL. In both figures,
(a) represents the spectrum of the video signal
and (b) represents the spectrum of the
subcarrier used for demodulation. Convolution of
(a) and (b), equivalent to multiplication in the
time domain, produces the spectrum shown in (c),
in which the baseband spectrum has been
shifted to be centered about FSC and -FSC. The
chrominance is now a baseband signal, which
may be separated from the low-frequency
luminance, centered at FSC, by a lowpass filter.

Figure 9.23. Example Display of Hanover Bars. Green is
the ideal color.

The lowpass filters after the demodulator
are a compromise between several factors.
Simply using a 1.3 MHz filter, such as the one
shown in Figure 9.26, increases the amount of
cross-color since a greater number of
luminance frequencies are included. When using
lowpass filters with a passband greater than
about 0.6 MHz for NTSC (4.2 - 3.58) or 1.07
MHz for PAL (5.5 - 4.43), the loss of the upper
sidebands of chrominance also introduces
ringing and color difference crosstalk. If a 1.3
MHz lowpass filter is used, it may include
some gain for frequencies between 0.6 MHz
and 1.3 MHz to compensate for the loss of part
of the upper sideband.

Filters with a sharp cutoff accentuate 
chrominance edge ringing; for these reasons 
slow roll-off 0.6 MHz filters, such as the one 
shown in Figure 9.27, are usually used. These 
result in poorer color resolution but minimize 
cross-color, ringing, and color difference 
crosstalk on edges. 



If the decoder is to be used in a pro-video
editing environment, the filters should have a
maximum ripple of ±0.1 dB in the passband.
This is needed to minimize the accumulation of
gain and loss artifacts due to the filters,
especially when multiple passes through the
encoding and decoding processes are required.

Luminance (Y) Processing 

To remove the sync and blanking informa- 
tion, Y data from either the Y/C separator or 
the luma ADC has the black level subtracted 
from it. At this point, negative Y values should 
be supported to allow test signals, keying infor- 
mation, and real-world video to pass through 
without corruption. 

A notch filter, with a center frequency of
FSC, is usually optional. It may be used to
remove any remaining chroma information
from the Y data. The notch filter is especially
useful to help clean up the Y data when comb
filtering Y/C separation is used for PAL, due to
the closeness of the PAL frequency packets.

Figure 9.24. Frequency Spectra for NTSC Digital Chrominance Demodulation (FS = 13.5 MHz, FSC
= 3.58 MHz). (A) Modulated chrominance. (B) Color subcarrier. (C) U and V spectrum produced by
convolving (A) and (B).

Figure 9.25. Frequency Spectra for NTSC Digital Chrominance Demodulation (FS = 12.27 MHz, FSC
= 3.58 MHz). (A) Modulated chrominance. (B) Color subcarrier. (C) U and V spectrum produced by
convolving (A) and (B).

Figure 9.26. Typical 1.3 MHz Lowpass Digital Filter Characteristics.

Figure 9.27. Typical 0.6 MHz Lowpass Digital Filter Characteristics.

User Adjustments

Contrast, Brightness, and Sharpness 

Programmable contrast, brightness, and 
sharpness adjustments may be implemented, 
as discussed in Chapter 7. In addition, color 
transient improvement may be used to 
improve the image quality. 

Hue 

A programmable hue adjustment may be 
implemented, as discussed in Chapter 7. 

Alternately, to reduce circuitry in the data
path, the hue adjustment is usually implemented
as a subcarrier phase offset that is
added to the 11-bit reference subcarrier phase
during the active video time (see Figure 9.32).
The result is to shift the phase of the sin and
cos subcarriers by a constant amount. An 11-bit
hue adjustment allows adjustments in hue
from 0° to 360°, in increments of 0.176°.
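
The 0.176° step follows directly from dividing one subcarrier cycle
into 2^11 phase values. A small Python sketch of the phase-offset
approach is shown below; the function names are hypothetical, and the
wrap-around addition is an assumption about how the offset would be
combined with the 11-bit reference phase.

```python
PHASE_BITS = 11
PHASE_MOD = 1 << PHASE_BITS      # 2048 phase values per subcarrier cycle

step_degrees = 360.0 / PHASE_MOD     # 0.17578125, i.e. about 0.176 degrees


def hue_degrees_to_offset(degrees):
    """Convert a hue setting in degrees to an 11-bit phase offset."""
    return round(degrees / 360.0 * PHASE_MOD) % PHASE_MOD


def apply_hue(reference_phase, hue_offset):
    """Add the hue offset to the reference phase, wrapping at 11 bits."""
    return (reference_phase + hue_offset) % PHASE_MOD
```

For example, a 90° hue shift corresponds to an offset of 512, and
adding an offset of 100 to a reference phase of 2000 wraps to 52.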

Due to the alternating sign of the V component
in PAL decoders, the sign of the phase offset
(θ) is set to be the opposite of the V
component. A negative sign of the phase offset
(θ) is equivalent to adding 180° to the desired
phase shift. PAL decoders do not usually have
a hue adjustment feature.

Saturation 

A programmable saturation adjustment 
may be implemented, as discussed in Chapter 
7. 

Alternately, to reduce circuitry in the data 
path, the saturation adjustment may be done 
on the sin and cos values in the demodulator. 

In either case, a burst level error signal and
the user-programmable saturation value are
multiplied together, and the result is used to
adjust the gain or attenuation of the color
difference signals. The intent here is to minimize
the amount of circuitry in the color difference
signal path. The burst level error signal is used
in the event the burst (and thus the modulated
chrominance information) is not at the correct
amplitude and adjusts the saturation of the
color difference signals appropriately.

For more information on the burst level 
error signal, please see the Color Killer sec- 
tion. 

Automatic Skin Tone Correction 

Skin tone correction may be used in NTSC
decoders since the eye is very sensitive to skin
tones, and the actual colors may become
slightly corrupted during the broadcast process.
If the grass is not quite the proper color
of green, it is not noticeable; however, a skin
tone that has a green or orange tint is
unacceptable. Since the skin tones are located close
to the +I axis, a typical skin tone corrector
looks for colors in a specific area (Figure 9.28),
and any colors within that area are made a
color that is closer to the skin tone.

A simple skin tone corrector may halve the
Q value for all colors that have a corresponding
+I value. However, this implementation
also changes nonskin tone colors. A more
sophisticated implementation halves Q only if
the color has a value between 25% and 75% of
full-scale and is within ±30° of the +I axis.
This moves any colors within the skin tone
region closer to ideal skin tone.
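
The more selective corrector might be sketched as follows. This Python
fragment makes two assumptions that the text does not spell out: the
"value" of the color is taken to be the chrominance magnitude
sqrt(I² + Q²), and full_scale is supplied by the caller (309, the
nominal (M) NTSC peak of I, is used in the example). The function name
is hypothetical.

```python
import math


def correct_skin_tone(i, q, full_scale, window_deg=30.0):
    """Halve Q for colors near the +I axis with moderate amplitude."""
    magnitude = math.hypot(i, q)                    # assumed "value"
    angle_from_i = math.degrees(math.atan2(q, i))   # 0 degrees on +I axis
    in_level = 0.25 * full_scale <= magnitude <= 0.75 * full_scale
    in_angle = abs(angle_from_i) <= window_deg
    if in_level and in_angle:
        q = q / 2                                   # pull toward +I axis
    return i, q


near_skin = correct_skin_tone(150, 40, full_scale=309)    # Q is halved
not_skin = correct_skin_tone(-150, 40, full_scale=309)    # left unchanged
```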

It should be noted that the phase angle for
skin tone varies between companies. Phase
angles from 116° to 126° are used; however,
using 123° (the +I axis) simplifies the
processing.

Figure 9.28. Typical Skin Tone Color Range.



Color Killer 

If a color burst of 12.5% or less of ideal 
amplitude is detected for 128 consecutive scan 
lines, the color difference signals should be 
forced to zero. Once a color burst of 25% or 
more of ideal amplitude is detected for 128 con- 
secutive scan lines, the color difference signals 
may again be enabled. This hysteresis pre- 
vents wandering back and forth between 
enabling and disabling the color information in 
the event the burst amplitude is borderline. 
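
The hysteresis can be modeled with a single counter. The Python class
below is a sketch only: it assumes the burst amplitude is already
available once per scan line as a fraction of the ideal amplitude, and
the class and attribute names are hypothetical.

```python
class ColorKiller:
    """Enable/disable color with 12.5% / 25% thresholds over 128 lines."""

    def __init__(self, kill_level=0.125, enable_level=0.25, lines=128):
        self.kill_level = kill_level
        self.enable_level = enable_level
        self.lines = lines
        self.color_enabled = True
        self._count = 0          # consecutive lines past the threshold

    def update(self, burst_ratio):
        """burst_ratio: measured burst amplitude / ideal, for one line."""
        if self.color_enabled:
            past = burst_ratio <= self.kill_level
        else:
            past = burst_ratio >= self.enable_level
        self._count = self._count + 1 if past else 0
        if self._count >= self.lines:
            self.color_enabled = not self.color_enabled
            self._count = 0
        return self.color_enabled
```

A burst hovering between the two thresholds never changes the state,
which is the purpose of the hysteresis.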

The burst level may be determined by forcing
all burst samples positive and sampling the
result multiple times to determine an average
value. This should be averaged over three scan
lines to limit line-to-line variations.

The burst level error is the ideal amplitude 
divided by the average result. If no burst is 
detected, this should be used to force the color 
difference signals to zero and to disable any fil- 
tering in the luminance path, allowing maxi- 
mum resolution luminance to be output. 

Optionally providing the ability to force
color decoding on or off is useful in some
applications, such as video editing.

Color Space Conversion

YUV or YIQ data is usually converted to
YCbCr or R'G'B' data before being output
from the decoder. If converting to R'G'B' data,
the R'G'B' data must be clipped at the 0 and
1023 values to prevent wrap-around errors.

(M) NTSC, (M, N) PAL 

YUV Color Space Processing 

Modern decoder designs are now based
on the YUV color space. For these decoders,
the YUV to YCbCr equations are:

Y601 = 1.691Y + 64

Cb = [1.984U cos θB] + [1.984V sin θB] + 512

Cr = [1.406U cos θR] + [1.406V sin θR] + 512

To generate R'G'B' data with a range of
0-1023, the YUV to R'G'B' equations are:

R' = 1.975Y + [2.251U cos θR] + [2.251V sin θR]

G' = 1.975Y - 0.779U - 1.146V

B' = 1.975Y + [4.013U cos θB] + [4.013V sin θB]

To generate R'G'B' data with a nominal range
of 64-940 for pro-video applications, the YUV
to R'G'B' equations are:

R' = 1.691Y + 1.928V + 64

G' = 1.691Y - 0.667U - 0.982V + 64

B' = 1.691Y + 3.436U + 64

The ideal values for θR and θB are 90° and 0°,
respectively. However, for consumer televisions
sold in the United States, θR and θB usually
have values of 110° and 0°, respectively, or
100° and -10°, respectively, to reduce the
visibility of differential phase errors, at the cost
of color accuracy.
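
The role of the two demodulation angles is easiest to see in code. The
following Python sketch implements the (M) NTSC YUV to YCbCr equations
above with the angles as parameters; the function name and the
defaults (the ideal 90° and 0°) are choices made for this example.

```python
import math


def yuv_to_ycbcr(y, u, v, theta_r_deg=90.0, theta_b_deg=0.0):
    """(M) NTSC YUV to YCbCr with selectable demodulation angles."""
    tr = math.radians(theta_r_deg)
    tb = math.radians(theta_b_deg)
    y601 = 1.691 * y + 64
    cb = 1.984 * (u * math.cos(tb) + v * math.sin(tb)) + 512
    cr = 1.406 * (u * math.cos(tr) + v * math.sin(tr)) + 512
    return y601, cb, cr


ideal = yuv_to_ycbcr(518, 226, 319)                        # 90 and 0 degrees
consumer = yuv_to_ycbcr(518, 226, 319, theta_r_deg=110.0)  # 110 and 0 degrees
```

With the ideal angles the equations reduce to Cb = 1.984U + 512 and
Cr = 1.406V + 512; any other angle mixes some U into Cr (or V into
Cb), which is the color-accuracy cost mentioned above.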



YIQ Color Space Processing 

For older NTSC decoder designs based on
the YIQ color space, the YIQ to YCbCr
equations are:

Y601 = 1.692Y + 64

Cb = -1.081I + 1.664Q + 512

Cr = 1.181I + 0.765Q + 512

To generate R'G'B' data with a range of
0-1023, the YIQ to R'G'B' equations are:

R' = 1.975Y + 1.887I + 1.224Q
G' = 1.975Y - 0.536I - 1.278Q
B' = 1.975Y - 2.189I + 3.367Q

To generate R'G'B' data with a nominal range
of 64-940 for pro-video applications, the YIQ to
R'G'B' equations are:

R' = 1.691Y + 1.616I + 1.048Q + 64
G' = 1.691Y - 0.459I - 1.094Q + 64
B' = 1.691Y - 1.874I + 2.883Q + 64
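
As a brief illustration of the clipping requirement, the Python sketch
below applies the 0-1023 range YIQ to R'G'B' equations and clips each
result. Rounding to the nearest integer before clipping is an
assumption of this sketch, and the function names are hypothetical.

```python
def clip10(x):
    """Round and clip to the 10-bit range 0-1023 (no wrap-around)."""
    return max(0, min(1023, round(x)))


def yiq_to_rgb(y, i, q):
    """(M) NTSC YIQ to R'G'B' with a 0-1023 output range."""
    r = 1.975 * y + 1.887 * i + 1.224 * q
    g = 1.975 * y - 0.536 * i - 1.278 * q
    b = 1.975 * y - 2.189 * i + 3.367 * q
    return clip10(r), clip10(g), clip10(b)


white = yiq_to_rgb(518, 0, 0)       # peak luminance, no chrominance
clipped = yiq_to_rgb(0, -100, 0)    # R' would be negative without clipping
```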

YCbCr Color Space Processing 

If the design is based on the YUV color 
space, the UV to CbCr conversion may be 
avoided by scaling the sin and cos values dur- 
ing the demodulation process, or scaling the 
color difference lowpass filter coefficients. 

NTSC-J 

Since the version of (M) NTSC used in 
Japan has a 0 IRE blanking pedestal, the color 
space conversion equations are slightly differ- 
ent from those for standard (M) NTSC. 

YUV Color Space Processing

Modern decoder designs are now based
on the YUV color space. For these decoders,
the YUV to YCbCr equations are:

Y601 = 1.564Y + 64

Cb = 1.835U + 512

Cr = [1.301U cos θR] + [1.301V sin θR] + 512

To generate R'G'B' data with a range of
0-1023, the YUV to R'G'B' equations are:

R' = 1.827Y + [2.082U cos θR] + [2.082V sin θR]

G' = 1.827Y - 0.721U - 1.060V

B' = 1.827Y + 3.712U

To generate R'G'B' data with a nominal range
of 64-940 for pro-video applications, the YUV
to R'G'B' equations are:

R' = 1.564Y + 1.783V + 64

G' = 1.564Y - 0.617U - 0.908V + 64

B' = 1.564Y + 3.179U + 64

The ideal value for θR is 90°. However, for
televisions sold in Japan, θR usually has a value
of 95° to reduce the visibility of differential
phase errors, at the cost of color accuracy.

YIQ Color Space Processing 

For older NTSC decoder designs based on
the YIQ color space, the YIQ to YCbCr
equations are:

Y601 = 1.565Y + 64

Cb = -1.000I + 1.539Q + 512

Cr = 1.090I + 0.708Q + 512

To generate R'G'B' data with a range of
0-1023, the YIQ to R'G'B' equations are:

R' = 1.827Y + 1.746I + 1.132Q
G' = 1.827Y - 0.496I - 1.182Q
B' = 1.827Y - 2.024I + 3.115Q

To generate R'G'B' data with a nominal range
of 64-940 for pro-video applications, the YIQ to
R'G'B' equations are:

R' = 1.564Y + 1.495I + 0.970Q + 64
G' = 1.564Y - 0.425I - 1.012Q + 64
B' = 1.564Y - 1.734I + 2.667Q + 64

YCbCr Color Space Processing 

If the design is based on the YUV color 
space, the UV to CbCr conversion may be 
avoided by scaling the sin and cos values dur- 
ing the demodulation process, or scaling the 
color difference lowpass filter coefficients. 

(B, D, G, H, I, Nc) PAL

YUV Color Space Processing

The YUV to YCbCr equations are:

Y601 = 1.599Y + 64
Cb = 1.875U + 512
Cr = 1.329V + 512

To generate R'G'B' data with a range of
0-1023, the YUV to R'G'B' equations are:

R' = 1.867Y + 2.128V

G' = 1.867Y - 0.737U - 1.084V

B' = 1.867Y + 3.793U

To generate R'G'B' data with a nominal range
of 64-940 for pro-video applications, the YUV
to R'G'B' equations are:

R' = 1.599Y + 1.822V + 64

G' = 1.599Y - 0.631U - 0.928V + 64

B' = 1.599Y + 3.248U + 64

YCbCr Color Space Processing 

The UV to CbCr conversion may be 
avoided by scaling the sin and cos values dur- 
ing the demodulation process, or scaling the 
color difference lowpass filter coefficients. 

Genlocking 

The purpose of the genlock circuitry is to
recover a sample clock and the timing control
signals (such as horizontal sync, vertical
sync, and the color subcarrier) from the video
signal. Since the original sample clock is not
available, it is usually generated by multiplying
the horizontal line frequency, FH, by the
desired number of samples per line, using a
phase-lock loop (PLL). Also, the color
subcarrier must be regenerated and locked to the
color subcarrier of the video signal being
decoded.

There are, however, several problems.
Video signals may contain noise, making the
determination of sync edges unreliable. The
amount of time between horizontal sync edges
may vary slightly each line, particularly in
analog videotape recorders (VCRs) due to
mechanical limitations. For analog VCRs,
instantaneous line-to-line variations are up to
±100 ns; line variations between the beginning
and end of a field are up to ±5 µs. When analog
VCRs are in a special feature mode, such as
fast-forwarding or still-picture, the amount of
time between horizontal sync signals may vary
up to ±20% from nominal.

Vertical sync information, as well as
horizontal sync information, must be recovered.
Unfortunately, analog VCRs, in addition to destroying
the SCH phase relationship, perform head
switching at field boundaries, usually somewhere
between the end of active video and the
start of vertical sync. When head switching
occurs, one video signal (field n) is replaced by
another video signal (field n + 1) which has an
unknown time offset from the first video signal.
There may be up to a ±1/2 line variation in
vertical timing each field. As a result,
longer-than-normal horizontal or vertical syncs may
be generated.

By monitoring the horizontal line timing, it
is possible to determine automatically whether
the video source is in the normal or special
feature mode. During normal operation, the
horizontal line time typically varies by no more
than ±5 µs over an entire field. Line timing
outside this ±5 µs window may be used to enable
special feature mode timing. Hysteresis should
be used in the detection algorithms to prevent
wandering back and forth between the normal
and special feature operations in the event the
video timing is borderline between the two
modes. A typical circuit for performing the
horizontal and vertical sync detection is shown
in Figure 9.29.

In the absence of a video signal, the
decoder should be designed to optionally
free-run, continually generating the video timing to
the system, without missing a beat. During the
loss of an input signal, any automatic gain
circuits should be disabled and the decoder
should provide the option either to be
transparent (so the input source can be monitored), to
auto-freeze the output data (to compensate for
short duration dropouts), or to autoblack the
output data (to avoid potential problems
driving a mixer or VCR).

Figure 9.29. Sync Detection and Phase Comparator Circuitry.

Horizontal Sync Detection 

Early decoders typically used analog sync 
slicing techniques to determine the midpoint 
of the leading edge of the sync pulse and used 
a PLL to multiply the horizontal frequency rate 
up to the sample clock rate. However, the lack 
of accuracy of the analog sync slicer, combined 
with the limited stability of the PLL, resulted in 
sample clock jitter and noise amplification. 
When using comb filters for Y/C separation, 
the long delay between writing and reading the 
video data means that even a small sample 
clock frequency error results in a delay that is 
a significant percentage of the subcarrier 
period, negating the effectiveness of the comb 
filter. 

Coarse Horizontal Sync Locking 

The coarse sync locking enables a faster 
lock-up time to be achieved. Digitized video is 
lowpass filtered to about 0.5 MHz to remove 
high-frequency information, such as noise and 
color subcarrier information. Performing the 
sync detection on lowpass filtered data also 
provides edge shaping in the event that fast 
sync edges (rise and fall times less than one 
clock cycle) are present. 

An 11-bit horizontal counter is incremented
each sample clock cycle, resetting to
0x001 after counting up to the HCOUNT value,
where HCOUNT specifies the total number of
samples per line. A value of 0x001 indicates
that the beginning of a horizontal sync is
expected. When the horizontal counter value is
(HCOUNT - 64), a sync gate is enabled, allowing
recovered sync information to be detected.



Up to five consecutive missing sync pulses
should be detected before any correction to
the clock frequency or other adjustments is
done. Once sync information has been
detected, the sync gate is disabled until the
next time the horizontal counter value is
(HCOUNT - 64). This helps filter out noise,
serration, and equalization pulses. If the
leading edge of recovered horizontal sync is not
within ±64 clock cycles (approximately ±5 µs)
of where it is expected to be, the horizontal
counter is reset to 0x001 to realign the edges
more closely.

Additional circuitry may be included to 
monitor the width of the recovered horizontal 
sync pulse. If the horizontal sync pulse is not 
approximately the correct pulse width, ignore 
it and treat it as a missing sync pulse. 

If the leading edge of recovered horizontal
sync is within ±64 sample clock cycles
(approximately ±5 µs) of where it is expected to be, the
fine horizontal sync locking circuitry is used to
fine-tune the timing.
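
The per-line decision made by the coarse locking logic can be
summarized in a few lines of Python. This is a behavioral sketch, not
a circuit: the function name and the action labels are hypothetical,
and it reads "up to five consecutive missing sync pulses" as
tolerating five misses before any corrective action is taken, which is
one possible interpretation.

```python
WINDOW = 64                      # sync window, in sample clock cycles
window_us = WINDOW / 13.5        # about 4.74 µs at a 13.5 MHz sample clock


def coarse_lock_step(offset, missed, window=WINDOW, max_missed=5):
    """Decide what to do for one line.

    offset: recovered sync position minus expected position, in sample
            clocks, or None if no sync was detected on this line.
    missed: number of consecutive missing syncs before this line.
    Returns (action, updated missed count).
    """
    if offset is None:
        missed += 1
        return ("adjust" if missed > max_missed else "hold"), missed
    if abs(offset) <= window:
        return "fine_lock", 0        # hand over to fine sync locking
    return "reset_counter", 0        # reset horizontal counter to 0x001
```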

Fine Horizontal Sync Locking 

One-half the sync amplitude is subtracted 
from the 0.5 MHz lowpass-filtered video data 
so the sync timing reference point (50% sync 
amplitude) is at zero. 

The leading horizontal sync edge may be 
determined by summing a series of weighted 
samples from the region of the sync edge. To 
perform the filtering, the weighting factors are 
read from a ROM by a counter triggered by 
the horizontal counter. When the central 
weighting factor (A0) is coincident with the 
50% amplitude point of the leading edge of 
sync, the result integrates to zero. Typical 
weighting factors are: 

A0 = 102/4096
A1 = 90/4096
A2 = 63/4096
A3 = 34/4096
A4 = 14/4096
A5 = 5/4096
A6 = 2/4096

This arrangement uses more of the timing 
information from the sync edge and sup- 
presses noise. Note that circuitry should be 
included to avoid processing the trailing edge 
of horizontal sync. 

Figure 9.30 shows the operation of the fine 
sync phase comparator. Figure 9.30a shows 
the leading sync edge for NTSC. Figure 9.30b 
shows the weighting factors being generated, 
and when multiplied by the sync information, 
produces the waveform shown in Figure 9.30c. 
When the A0 coefficient is coincident with the 
50% amplitude point of sync, the waveform 
integrates to zero. Distortion of sync edges, 
resulting in the locking point being slightly 
shifted, is minimized by the lowpass filtering, 
effectively shaping the sync edges prior to pro- 
cessing. 
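
The behavior of the fine sync phase comparator can be reproduced with
a short Python sketch. Two assumptions are made here: the weighting
factors are applied symmetrically about A0 (A6 ... A1, A0, A1 ... A6,
thirteen taps in all), and the lowpass-filtered sync edge is modeled
by a tanh curve whose 50% point sits at zero. The function names and
the edge shape are illustrative only.

```python
import math

A = [102, 90, 63, 34, 14, 5, 2]                  # A0..A6, units of 1/4096
WEIGHTS = [a / 4096 for a in A[:0:-1] + A]       # A6..A1, A0, A1..A6


def sync_edge(t, rise=3.0):
    """Hypothetical filtered falling sync edge, zero at its 50% point."""
    return -math.tanh(t / rise)


def phase_error(edge_offset):
    """Weighted sum of samples around the expected edge position.

    edge_offset: how many samples late (+) or early (-) the actual
    50% point is relative to the A0 tap.
    """
    taps = range(-6, 7)
    return sum(w * sync_edge(k - edge_offset) for k, w in zip(taps, WEIGHTS))
```

With the edge centered on A0 the sum is zero; a late edge gives a
positive result and an early edge a negative one, so the sign and size
of the sum can steer the sample clock.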

Sample Clock Generation 

The horizontal sync phase error signal 
from Figure 9.29 is used to adjust the fre- 
quency of a line-locked PLL, as shown in Fig- 
ure 9.31. A line-locked PLL always generates a 
constant number of clock cycles per line, 
regardless of any line time variations. The free- 
running frequency of the PLL should be the 
nominal sample clock frequency required (for 
example, 13.5 MHz). 

Using a VCO-based PLL has the advantage
of a wider range of sample clock frequency
adjustments, useful for handling video timing
variations outside the normal video
specifications. A disadvantage is that, due to jitter in the
sample clock, there may be visible hue
artifacts and poor Y/C separation.

A VCXO-based PLL has the advantage of 
minimal sample clock jitter. However, the sam- 
ple clock frequency range may be adjusted 
only a small amount, limiting the ability of the 
decoder to handle nonstandard video timing. 

Ideally, with either design, the rising edge 
of the sample clock is aligned with the half- 
amplitude point of the leading edge of horizon- 
tal sync, and a fixed number of sample clock 
cycles per line (HCOUNT) are always gener- 
ated. 

An alternate method is to asynchronously 
sample the video signal with a fixed-frequency 
clock (for example, 13.5 MHz). Since in this 
case the sample clock is not aligned with hori- 
zontal sync, there is a phase difference 
between the actual sample position and the 
ideal sample position. As with the conventional 
genlock solution, this phase difference is 
determined by the difference between the 
recovered and expected horizontal syncs. 

The ideal sample position is defined to be
aligned with a sample clock generated by a
line-locked PLL. Rather than controlling the
sample clock frequency, the horizontal sync
phase error signal is used to control
interpolation between two samples of data to generate
the ideal sample value. If using comb filtering
for Y/C separation, the digitized composite
video may be interpolated to generate the ideal
sample points, providing better Y/C separation
by aligning the samples more precisely.
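
A minimal Python sketch of the interpolation step is given below. It
assumes simple linear interpolation between adjacent samples, which
the text does not specify (a practical design would more likely use a
multi-tap interpolation filter), and it expresses the phase error as a
fraction of one sample period. The function name is hypothetical.

```python
def interpolate_ideal(samples, phase):
    """Estimate sample values at the ideal (line-locked) positions.

    phase: fraction (0..1) of a sample period by which each ideal
    position lies after the corresponding actual sample.
    """
    return [(1 - phase) * a + phase * b
            for a, b in zip(samples, samples[1:])]


ideal = interpolate_ideal([0, 10, 20], 0.25)
```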

Vertical Sync Detection 

Digitized video is lowpass filtered to about
0.5 MHz to remove high-frequency information,
such as noise and color subcarrier
information. The 10-bit vertical counter is
incremented by each expected horizontal sync,
resetting to 0x001 after counting up to 525 or
625. A value of 0x001 indicates that the
beginning of a vertical sync for Field 1 is expected.

Figure 9.30. Fine Lock Phase Comparator Waveforms.
(A) The NTSC sync leading edge. (B) The series of
weighting factors. (C) The weighted leading edge
samples.

Figure 9.31. Typical Line-Locked Sample Clock Generation.

The end of vertical sync intervals is 
detected and used to set the value of the verti- 
cal counter according to the mode of opera- 
tion. By monitoring the relationship of 
recovered vertical and horizontal syncs, Field 
1 vs. Field 2 information is detected. If a recov- 
ered horizontal sync occurs more than 64, but 
less than (HCOUNT/2), clock cycles after 
expected horizontal sync, the vertical counter 
is not adjusted to avoid double incrementing 
the vertical counter. If a recovered horizontal 
sync occurs (HCOUNT/2) or more clock 
cycles after the vertical counter has been 
incremented, the vertical counter is again 
incremented. 

During special feature operation, there is 
no longer any correlation between the vertical 
and horizontal timing information, so Field 1 
vs. Field 2 detection cannot be done. Thus, 
every other detection of the end of vertical 
sync should set the vertical counter accord- 
ingly in order to synthesize Field 1 and Field 2 
timing. 

Subcarrier Generation 

As with the encoder, the color subcarrier is
generated from the sample clock using a DTO
(Figure 9.32), and the same frequency
relationships apply as those discussed in the encoder
section.

Unlike the encoder, the phase of the
generated subcarrier must be continuously adjusted
to match that of the video signal being
decoded.

The subcarrier locking circuitry phase
compares the generated subcarrier and the
incoming subcarrier, resulting in an FSC error
signal indicating the amount of phase error.
This FSC error signal is added to the [p] value
to continually adjust the step size of the DTO,
adjusting the phase of the generated
subcarrier to match that of the video signal being
decoded.

As a 22-bit single-stage DTO is used to
divide down the sample clock to generate the
subcarrier in Figure 9.32, the [p] value is
determined as follows:

FSC/FS = (P/4194303) = (P/(2^22 - 1))

where FSC = the desired subcarrier frequency
and FS = the sample clock rate. Some values of
[p] for popular sample clock rates are shown in
Table 9.10.
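
The relationship can be checked with exact arithmetic. The Python
sketch below computes P from the number of subcarrier cycles per scan
line and HCOUNT, using the standard line-locked relationships (227.5
cycles per line for (M) NTSC; 1135/4 + 1/625 cycles per line for
(B, D, G, H, I) PAL) so that FSC/FS equals cycles per line divided by
HCOUNT. The function and constant names are hypothetical, and rounding
to the nearest integer is an assumption of this sketch.

```python
from fractions import Fraction

DTO_MODULUS = 2**22 - 1          # 4194303


def dto_p(cycles_per_line, hcount):
    """P = round((FSC / FS) * (2^22 - 1)), FSC/FS = cycles / HCOUNT."""
    return round(Fraction(cycles_per_line) / hcount * DTO_MODULUS)


NTSC_CYCLES = Fraction(455, 2)                       # (M) NTSC
PAL_CYCLES = Fraction(1135, 4) + Fraction(1, 625)    # (B, D, G, H, I) PAL

p_values = {
    "13.5 MHz NTSC": dto_p(NTSC_CYCLES, 858),
    "13.5 MHz PAL": dto_p(PAL_CYCLES, 864),
    "12.27 MHz NTSC": dto_p(NTSC_CYCLES, 780),
    "14.75 MHz PAL": dto_p(PAL_CYCLES, 944),
}
```

Under these assumptions the four results are 1,112,126, 1,377,477,
1,223,338, and 1,260,742.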

Subcarrier Locking 

The purpose of the subcarrier locking cir- 
cuitry (Figure 9.33) is to phase lock the gener- 
ated color subcarrier to the color subcarrier of 
the video signal being decoded. 

Digital composite video (or digital
chrominance video) has the blanking level subtracted
from it. It is also gated with a burst gate to
ensure that the data has a value of zero outside
the burst time. The burst gate signal should be
timed to eliminate the edges of the burst,
which may have transient distortions that will
reduce the accuracy of the phase measurement.

Figure 9.32. Chrominance Subcarrier Generator.

Typical Application              Total Samples per     P
                                 Scan Line (HCOUNT)

13.5 MHz (M) NTSC                858                   1,112,126
13.5 MHz (B, D, G, H, I) PAL     864                   1,377,477
12.27 MHz (M) NTSC               780                   1,223,338
14.75 MHz (B, D, G, H, I) PAL    944                   1,260,742

Table 9.10. Typical HCOUNT and P Values for the 1-Stage 22-Bit DTO in
Figure 9.32.

Figure 9.33. Subcarrier Phase Comparator Circuitry.

The color burst data is phase compared to
the locally generated burst. Note that the sign
information must also be compared so lock will
not occur on 180° out-of-phase signals. The
burst accumulator averages the sixteen
samples, and the accumulated values from two
adjacent lines are averaged to produce the
error signal. When the local subcarrier is
correctly phased, the accumulated values from
alternate lines cancel, and the phase error
signal is zero. The error signal is sampled at the
line rate and processed by the loop filter, which
should be designed to achieve a lock-up time of
about 10 lines (50 or more lines may be
required for noisy video signals). It is desirable
to avoid updating the error signal during
vertical intervals due to the lack of burst. The
resulting FSC error signal is used to adjust the
DTO that generates the local subcarrier
(Figure 9.32).

During PAL operation, the phase detector 
also recovers the PAL switch information used 
in generating the switched V subcarrier. The 
PAL switch D flip-flop is synchronized to the 
incoming signal by comparing the local switch 
sense with the sign of the accumulated burst 
values. If the sense is consistently incorrect for 
sixteen lines, then the flip-flop is reset. 

Note that the subcarrier locking circuit should
be able to handle short-term frequency
variations (over a few frames) of ±200 Hz, long-term
frequency variations of ±500 Hz, and color
burst amplitudes of 25-200% of normal with
short-term amplitude variations (over a few
frames) of up to 5%. The lock-up time of ten
lines is desirable to accommodate video
signals that may have been incorrectly edited
(i.e., without regard for the SCH phase
relationship) or nonstandard video signals due to
freeze-framing, special effects, and so on. The
ten lines enable the subcarrier to be locked
before the active video time, ensuring correct
color representation at the beginning of the
picture.

Video Timing Generation 

HSYNC# (Horizontal Sync) Generation 

An 11-bit horizontal counter is incre- 
mented on each rising edge of the sample 
clock. The count is monitored to determine 
when to generate the burst gate, HSYNC# out- 
put, horizontal blanking, etc. Typically, each 
time the counter is reset to 0x001, the 
HSYNC# output is asserted. The exact timing 
of HSYNC# is dependent on the video inter- 
face used, as discussed in Chapter 6. 

H (Horizontal Blanking) Generation 

A horizontal blanking signal, H, may be 
implemented to specify when the horizontal 
blanking interval occurs. The timing of H is 
dependent on the video interface used, as dis- 
cussed in Chapter 6. 

The horizontal blank timing may be user 
programmable by incorporating start and stop 
blank registers. The values of these registers 
are compared to the horizontal counter value, 
and used to assert and negate the H control 
signal. 
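
A minimal sketch of the register comparison follows. The function name and the register values used in the usage note are hypothetical; the only behavior taken from the text is that H is asserted when the horizontal counter reaches the start value and negated when it reaches the stop value.

```python
def h_blank(count, start, stop):
    """Return True while the horizontal counter is inside the blanking window.

    The window runs from `start` up to, but not including, `stop`. It may
    wrap past the end of the line (start > stop), which is the usual case
    when the counter is reset at HSYNC#.
    """
    if start <= stop:
        return start <= count < stop
    return count >= start or count < stop
```

For example, with a counter that runs from 1 to 858 and illustrative register values of start = 800 and stop = 100, H is asserted for counts 800 through 858 and 1 through 99. The same comparison applies to the vertical counter and the V signal.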

VSYNC# (Vertical Sync) Generation 

A 10-bit vertical counter is incremented on 
each rising edge of HSYNC#. Typically, each 
time the counter is reset to 0x001, the VSYNC# 
output is asserted. The exact timing of 
VSYNC# is dependent on the video interface 
used, as discussed in Chapter 6.

F (FIELD) Generation

A field signal, F, may be implemented to 
specify whether Field 1 or Field 2 is being 
decoded. The exact timing of F is dependent 
on the video interface used, as discussed in 
Chapter 6. 

In instances where the output of an analog 
VCR is being decoded, and the VCR is in a special
effects mode (such as still or fast-forward),
there is no longer enough timing information 
to determine Field 1 vs. Field 2 timing. Thus, 
the Field 1 and Field 2 timing as specified by 
the VSYNC#/HSYNC# relationship (or the F 
signal) should be synthesized and may not 
reflect the true field timing of the video signal 
being decoded. 

V (Vertical Blanking) Generation 

A vertical blanking signal, V, may be imple- 
mented to specify when the vertical blanking 
interval occurs. The exact timing of V is depen- 
dent on the video interface used, as discussed 
in Chapter 6. 

The vertical blank timing may be user pro- 
grammable by incorporating start and stop 
blank registers. The values of these registers 
are compared to the vertical counter value, and 
used to assert and negate the V control signal. 

BLANK# Generation 

The composite blanking signal, BLANK#, 
is the logical NOR of the H and V signals. 

While BLANK# is asserted, RGB data may 
be forced to be a value of 0. YCbCr data may be 
forced to an 8-bit value of 16 for Y and 128 for 
Cb and Cr. Alternately, the RGB or YCbCr data 
outputs may not be blanked, allowing vertical 
blanking interval (VBI) data, such as closed 
captioning, teletext, widescreen signaling, and 
other information to be output. 
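
As a sketch of the logic just described, the following models BLANK# as the NOR of H and V and substitutes the 8-bit YCbCr blanking level while BLANK# is asserted. The function names and the tuple representation of a pixel are assumptions made for illustration.

```python
def blank_n(h, v):
    """Active-low composite blanking: logical NOR of H and V (0 = blanked)."""
    return int(not (h or v))


def apply_blanking(pixel, h, v, blank_value=(16, 128, 128)):
    """Return the 8-bit YCbCr pixel, or the blanking level while BLANK# is low."""
    return pixel if blank_n(h, v) else blank_value
```

To pass VBI data through instead, a design would simply bypass `apply_blanking` during the lines of interest.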



Field Identification 

Although the timing relationship between 
the horizontal sync (HSYNC#) and vertical 
sync (VSYNC#) signals, or the F signal, may 
be used to specify whether a Field 1 vs. Field 2 
is being decoded, one or two additional signals 
may be used to specify which one of four or 
eight fields is being decoded, as shown in 
Table 9.7. We refer to these additional control 
signals as FIELD_0 and FIELD_1. 

FIELD_0 should change state at the begin- 
ning of VSYNC#, or coincident with F, during 
fields 1, 3, 5, and 7. FIELD_1 should change 
state at the beginning of VSYNC#, or coinci- 
dent with F, during fields 1 and 5. 

NTSC Field Identification 

The beginning of fields 1 and 3 may be 
determined by monitoring the relationship of 
the subcarrier phase relative to sync. As 
shown in Figure 8.5, at the beginning of field 1, 
the subcarrier phase is ideally 0° relative to 
sync; at the beginning of field 3, the subcarrier 
phase is ideally 180° relative to sync. 

In the real world, there is a tolerance in the 
SCH phase relationship. For example, 
although the ideal SCH phase relationship may 
be perfect at the source, transmitting the video 
signal over a coaxial cable may result in a shift 
of the SCH phase relationship due to cable 
characteristics. Thus, the ideal phase plus or
minus a tolerance should be used. Although
±40° (NTSC) or ±20° (PAL) is specified as an
acceptable tolerance by the video standards,
many decoder designs use a tolerance of up to
±80°.
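
The tolerance test can be sketched as follows for NTSC, assuming the SCH phase has already been measured in degrees at the start of an odd field. The function names are hypothetical, and the default tolerance of 80° is simply the upper figure mentioned above.

```python
def phase_error(measured_deg, ideal_deg):
    """Smallest signed difference between two angles, in the range [-180, 180)."""
    return (measured_deg - ideal_deg + 180.0) % 360.0 - 180.0


def ntsc_field_from_sch(measured_deg, tolerance_deg=80.0):
    """Classify the start of an odd NTSC field from the SCH phase.

    Returns 1 (ideal 0 degrees), 3 (ideal 180 degrees), or None when the
    measured phase is outside the tolerance of both.
    """
    if abs(phase_error(measured_deg, 0.0)) <= tolerance_deg:
        return 1
    if abs(phase_error(measured_deg, 180.0)) <= tolerance_deg:
        return 3
    return None
```

A result of None corresponds to the out-of-tolerance case, where the decoder carries on and only changes behavior if the condition persists.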

In the event that a SCH phase relationship 
not within the proper tolerance is detected, the 
decoder should proceed as if nothing were 
wrong. If the condition persists for several 
frames, indicating that the video source may
no longer be a stable video source, operation
should change to that for an unstable video 
source. 

For unstable video sources that do not 
maintain the proper SCH relationship (such as 
analog VCRs), synthesized FIELD_0 and 
FIELD_1 outputs should be generated (for 
example, by dividing the F output signal by two 
and four) in the event the signal is required for 
memory addressing or downstream process- 
ing. 
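
One way to synthesize these outputs is from a free-running field counter, as sketched below. The polarities are assumptions (F = 0 during Field 1, and both FIELD signals low during field 1 of the sequence); the actual encoding is whatever Table 9.7 specifies.

```python
def synthesized_field_signals(num_fields):
    """Yield (F, FIELD_0, FIELD_1) per field from a free-running field counter."""
    for n in range(num_fields):
        f = n & 1                # F: 0 during Field 1, 1 during Field 2 (assumed)
        field_0 = (n >> 1) & 1   # F divided by two: changes at fields 1, 3, 5, 7
        field_1 = (n >> 2) & 1   # F divided by four: changes at fields 1 and 5
        yield f, field_0, field_1
```

Over eight fields, FIELD_0 changes state at the start of every odd field and FIELD_1 at the start of fields 1 and 5, matching the timing given earlier for a stable source.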

PAL Field Identification 

The beginning of fields 1 and 5 may be 
determined by monitoring the relationship of 
the -U component of the extrapolated burst 
relative to sync. As shown in Figure 8.16, at the 
beginning of field 1, the phase is ideally 0° rela- 
tive to sync; at the beginning of field 5, the 
phase is ideally 180° relative to sync. Either the 
burst blanking sequence or the subcarrier 
phase may be used to differentiate between 
fields 1 and 3, fields 2 and 4, fields 5 and 7, and 
fields 6 and 8. All of the considerations dis- 
cussed for NTSC in the previous section also 
apply for PAL. 

Auto-Detection of Video Signal Type 

If the decoder can automatically detect the 
type of video signal being decoded, and config- 
ure itself automatically, the user will not have 
to guess at the type of video signal being pro- 
cessed. This information can be passed via sta- 
tus information to the rest of the system. 

If the decoder detects fewer than 575 lines
per frame for at least 16 consecutive frames,
the decoder can assume the video signal is
(M) NTSC or (M) PAL. First, assume the
video signal is (M) NTSC, as that is much more
popular. If the vertical and horizontal timing
remains locked, but the decoder is unable to
maintain subcarrier locking, the video signal
may be (M) PAL. In that case, try (M) PAL
operation and verify the burst timing.

If the decoder detects more than 575 lines
per frame for at least 16 consecutive frames, it
can assume the video signal is (B, D, G, H, I,
N, NC) PAL or a version of SECAM.

First, assume the video signal is (B, D, G,
H, I, N) PAL. If the vertical and horizontal timing
remains locked, but the decoder is unable to
maintain a subcarrier lock, it may mean the
video signal is (NC) PAL or SECAM. In that
case, try SECAM operation (as that is much
more popular), and if that does not achieve
subcarrier lock, try (NC) PAL operation.

If the decoder detects a video signal format 
to which it cannot lock, this should be indi- 
cated so the user can be notified. 

Note that auto-detection cannot be per- 
formed during special feature modes of analog 
VCRs, such as fast-forwarding. If the decoder 
detects a special feature mode of operation, it 
should disable the auto-detection circuitry. 
Auto-detection should only be done when a 
video signal has been detected after the loss of 
an input video signal. 
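
The trial order described in this section can be sketched as a short decision function. The callable `subcarrier_locks` is hypothetical and stands in for whatever mechanism reconfigures the decoder and reports subcarrier lock; a line count of exactly 575 is not covered by the text and is treated here, as an assumption, like the 625-line case.

```python
def detect_standard(lines_per_frame, subcarrier_locks):
    """Pick a video standard by trial, in the order described above.

    lines_per_frame: line count that has been stable for 16 consecutive frames.
    subcarrier_locks: hypothetical callable taking a standard name and
        returning True if the decoder locks when configured for it.
    Returns the standard name, or None if nothing locks (notify the user).
    """
    if lines_per_frame < 575:
        candidates = ["(M) NTSC", "(M) PAL"]
    else:
        candidates = ["(B, D, G, H, I, N) PAL", "SECAM", "(NC) PAL"]
    for standard in candidates:
        if subcarrier_locks(standard):
            return standard
    return None
```

A real implementation would also hold off this search while a VCR special feature mode is detected, as noted above.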

Y/C Separation Techniques 

The encoder typically combines the lumi- 
nance and chrominance signals by simply add- 
ing them together; the result is that 
chrominance and high-frequency luminance 
signals occupy the same portion of the fre- 
quency spectrum. As a result, separating them 
in the decoder is difficult. When the signals are 
decoded, some luminance information is 
decoded as color information (referred to as 
cross-color), and some chrominance informa- 
tion remains in the luminance signal (referred 
to as cross-luminance). Due to the stable
performance of digital decoders, much more complex
separation techniques can be used than is
possible with analog decoders.

The presence of crosstalk is bad news in 
editing situations; crosstalk components from 
the first decoding are encoded, possibly caus- 
ing new or additional artifacts when decoded 
the next time. In addition, when a still frame is 
captured from a decoded signal, the frozen 
residual subcarrier on edges may beat with the 
subcarrier of any following encoding process, 
resulting in edge flicker in colored areas. 
Although the crosstalk problem cannot be 
solved entirely at the decoder, more elaborate 
Y/C separation minimizes the problem. 

If the decoder is used in an editing envi- 
ronment, the suppression of cross-luminance 
and cross-color is more important than
the appearance of the decoded picture. When a 
picture is decoded, processed, encoded, and 
again decoded, cross-effects can introduce 
substantial artifacts. It may be better to limit 
the luminance bandwidth (to reduce cross- 
luminance), producing softer pictures. Also, 
limiting the chrominance bandwidth to less 
than 1 MHz reduces cross-color, at the 
expense of losing chrominance definition. 

Complementary Y/C separation pre- 
serves all of the input signal. If the separated 
chrominance and luminance signals are added 
together again, the original composite video 
signal is generated. 

Noncomplementary Y/C separation intro- 
duces some irretrievable loss, resulting in gaps 
in the frequency spectrum if the separated 
chrominance and luminance signals are again 
added together to generate a composite video 
signal. The loss is due to the use of narrower 
filters to reduce cross-color and cross-lumi- 
nance. Therefore, noncomplementary filtering 
is usually unsuitable when multiple encoding 
and decoding operations must be performed, 
as the frequency spectrum gaps grow
with each additional decoding operation.
It does, however, enable the tweaking
of luminance and chrominance response
for optimum viewing.

Simple Y/C Separation 

With all of these implementations, there is 
no loss of vertical chrominance resolution, but 
there is also no suppression of cross-color. For 
PAL, line-to-line errors due to differential 
phase distortion are not suppressed, resulting 
in the vertical pattern known as Hanover bars. 

Noticeable artifacts of simple Y/C separa- 
tors are color artifacts on vertical edges. These 
include color ringing, color smearing, and the 
display of color rainbows in place of high-fre- 
quency gray-scale information. 

Lowpass and Highpass Filtering 

The most basic Y/C separator assumes 
frequencies below a certain point are lumi- 
nance and above this point are chrominance. 
An example of this simple Y/C separator is 
shown in Figure 9.34. 

Frequencies below 3.0 MHz (NTSC) or 3.8 
MHz (PAL) are assumed to be luminance. Fre- 
quencies above these are assumed to be 
chrominance. Not only is high-frequency lumi- 
nance information lost, but it is assumed to be 
chrominance information, resulting in cross- 
color. 
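
A minimal sketch of this kind of separator is shown below, assuming the composite signal is available as a list of samples. The filter taps are illustrative only and are not designed for a 3.0 MHz or 3.8 MHz corner; they merely happen to place a null at one quarter of the sample rate. Chrominance is formed as the input minus the lowpass output, which makes this particular sketch complementary.

```python
def split_y_c(composite, taps=(1, 2, 2, 2, 1)):
    """Split composite samples into (Y, C) with a symmetric FIR lowpass.

    Y is the lowpass output; C is the input minus Y, so Y + C reproduces
    the input exactly. The default taps are illustrative, not a real design.
    """
    half = len(taps) // 2
    norm = float(sum(taps))
    n = len(composite)
    luma, chroma = [], []
    for i in range(n):
        acc = 0.0
        for k, tap in enumerate(taps):
            j = min(max(i + k - half, 0), n - 1)   # clamp at the ends of the line
            acc += tap * composite[j]
        y = acc / norm
        luma.append(y)
        chroma.append(composite[i] - y)
    return luma, chroma
```

Any luminance detail above the filter corner ends up in the C output, which is the cross-color mechanism described above.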

Notch Filtering 

Although broadcast NTSC and PAL sys- 
tems are strictly bandwidth-limited, this may 
not be true of other video sources. Luminance 
information may be present all the way out to 6 
or 7 MHz or even higher. For this reason, the 
designs in Figure 9.35 are usually more appropriate,
as they allow high-frequency luminance
to pass, resulting in a sharper picture.




448 Chapter 9: NTSC and PAL Digital Encoding and Decoding 



[Block diagram: composite video feeds a lowpass filter (NTSC = 3.0 MHz, PAL = 3.8 MHz) that outputs Y, and a highpass filter (NTSC = 3.0 MHz, PAL = 3.8 MHz) that outputs C.]

Figure 9.34. Typical Simple Y/C Separator.

Figure 9.35. Typical Simple Y/C Separator. (A) Complementary
filtering. (B) Noncomplementary filtering.

Many designs based on the notch filter
also incorporate comb filters in the Y and color 
difference data paths to reduce cross-color and 
cross-luma artifacts. However, the notch filter 
still limits the overall Y/C separation quality.

PAL Considerations 

As mentioned before, PAL uses normal 
and inverted scan lines, referring to whether 
the V component is normal or inverted, to help 
correct color shifting effects due to differential 
phase distortions. 

For example, differential phase distortion 
may cause the green vector angle on normal 
scan lines to lag by 45° from the ideal 241° 
shown in Figure 8.11. This results in a vector at 
196°, effectively shifting the resulting color 
towards yellow. On inverted scan lines, the 
vector angle also will lag by 45° from the ideal 
120° shown in Figure 8.12. This results in a 
vector at 75°, effectively shifting the resulting 
color towards cyan. 

PAL Delay Line 

Figure 9.36, made by flipping Figure 8.12 
180° about the U axis and overlaying the result 
onto Figure 8.11, illustrates the cancellation of 
the phase errors. The average phase of the two 
errors, 196° on normal scan lines and 286° on 
inverted scan lines, is 241°, which is the cor- 
rect phase for green. For this reason, simple 
PAL decoders usually use a delay line (or line 
store) to facilitate averaging between two scan 
lines. 
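
The cancellation can be checked numerically by treating each chrominance sample as a unit vector in the U/V plane. This is a sketch under the assumption that the inverted-line vector is the exact mirror image of the normal-line vector about the U axis (so green sits at 119° rather than the rounded 120° quoted above); the function name is hypothetical.

```python
import cmath
import math


def averaged_phase(ideal_deg, error_deg):
    """Average a normal line with the mirrored inverted line.

    Returns (phase in degrees, relative amplitude) of the averaged vector.
    """
    normal = cmath.rect(1.0, math.radians(ideal_deg - error_deg))
    # On inverted lines V is negated, so the ideal phase is -ideal_deg;
    # the same phase lag is then applied.
    inverted = cmath.rect(1.0, math.radians(-ideal_deg - error_deg))
    mirrored = inverted.conjugate()          # flip about the U axis
    mean = (normal + mirrored) / 2.0
    return math.degrees(cmath.phase(mean)) % 360.0, abs(mean)
```

For green (241°) with a 45° lag, the two vectors lie at 196° and 286° and their average is back at 241°. The averaged amplitude is reduced by cos 45°, so the hue error is traded for a loss of saturation.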

Using delay lines in PAL Y/C separators 
has unique problems. The subcarrier refer- 
ence changes by -90° (or 270°) over one line 
period, and the V subcarrier is inverted on 
alternate lines. Thus, there is a 270° phase dif- 
ference between the input and output of a line 
delay. If we want to do a simple addition or sub- 
traction between the input and output of the 
delay line to recover chrominance information,
the phase difference must be 0° or 180°. And
there is still that switching V floating around.
Thus, we would like to find a way to align the
subcarrier phases between lines and compensate
for the switching V.

Simple circuits, such as the noncomple- 
mentary Y/C separator shown in Figure 9.37, 
use a delay line that is not a whole line (283.75
subcarrier periods), but rather 284 subcarrier
periods. This small difference acts as a 90° 
phase shift at the subcarrier frequency. 

Since there is an integral number of subcarrier
periods in the delay, the U subcarriers
at the input and output of the 284 TSC delay
line are in phase, and they can simply be added
together to recover the U subcarrier. The V
subcarriers are 180° out of phase at the input
and output of the 284 TSC delay line, due to the
switching V, so the adder cancels them out.
Any remaining high-frequency vertical V components
are rejected by the U demodulator.

Due to the switching V, subtracting the 
input and output of the 284 TSC delay line
recovers the V subcarrier while canceling the 
U subcarrier. Any remaining high-frequency 
vertical U components are rejected by the V 
demodulator. 

Since the phase shift through the 284 TSC
delay line is a function of frequency, the sub- 
carrier sidebands are not phase shifted exactly 
90°, resulting in hue errors on vertical chromi- 
nance transitions. Also, the chrominance and 
luminance are not vertically aligned since the 
chrominance is shifted down by one-half line. 
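
The add and subtract behavior of the 284 TSC delay line can be illustrated with a small simulation. The sketch assumes sampling at four times the subcarrier frequency, a flat color, and a line length of exactly 283.75 subcarrier periods (the 25 Hz offset is ignored); all names are hypothetical.

```python
SIN4 = (0, 1, 0, -1)   # sin(k * 90 degrees) at 4x subcarrier sampling
COS4 = (1, 0, -1, 0)   # cos(k * 90 degrees)
LINE = 1135            # samples per line: 283.75 subcarrier periods
DELAY = 1136           # 284 subcarrier periods


def pal_chroma(u, v, num_lines):
    """Flat-color PAL chrominance with the V subcarrier inverted on odd lines."""
    out = []
    for k in range(num_lines * LINE):
        switch = -1 if (k // LINE) & 1 else 1
        out.append(u * SIN4[k % 4] + switch * v * COS4[k % 4])
    return out


def delay_line_separate(chroma):
    """Return (u_sub, v_sub): half-sum and half-difference across the delay."""
    span = range(DELAY, len(chroma))
    u_sub = [(chroma[k] + chroma[k - DELAY]) / 2 for k in span]
    v_sub = [(chroma[k] - chroma[k - DELAY]) / 2 for k in span]
    return u_sub, v_sub
```

With two lines of input, every output sample pairs the second line with the first, so the half-sum contains only the U subcarrier and the half-difference only the switched V subcarrier. Because the delay is one sample longer than a line, the first sample of each line pairs with a line of the same switch sense and does not cancel; in practice that sample falls in horizontal blanking.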

PAL Modifier 

Although the performance of the circuit in
Figure 9.37 usually is adequate, the 284 TSC
delay line may be replaced by a line delay followed
by a PAL modifier, as shown in Figure
9.38. The PAL modifier provides a 90° phase
shift and inversion of the V subcarrier. Chrominance
from the PAL modifier is now in phase
with the line delay input, allowing the two to be
combined using a single adder and share a
common path to the demodulators. The averaging
sacrifices some vertical resolution; however,
Hanover bars are suppressed.

Figure 9.36. Phase Error Correction for PAL.

Figure 9.37. Single Delay Line PAL Y/C Separator.

Figure 9.38. Single-Line Delay PAL Y/C Separator Using a PAL Modifier.

Since the chrominance at the demodulator 
input is in phase with the composite video, it 
can be used to cancel the chrominance in the 
composite signal to leave luminance. However, 
the chrominance and luminance are still not 
vertically aligned since the chrominance is 
shifted down by one-half line. 

The PAL modifier produces a luminance 
alias centered at twice the subcarrier fre- 
quency. Without the bandpass filter before the 
PAL modifier and the averaging between lines, 
mixing the original and aliased luminance com- 
ponents would result in a 12.5 Hz beat fre- 
quency, noticeable in high-contrast areas of the 
picture. 



2D Comb Filtering 

In the previous Y/C separators, high-fre- 
quency luminance information is treated as 
chrominance information; no attempt is made 
to differentiate between the two. As a result, 
the luminance information is interpreted as 
chrominance information (cross-color) and 
passed on to the chroma demodulator to 
recover color information. The demodulator 
cannot differentiate between chrominance and 
high-frequency luminance, so it generates 
color where color should not exist. Thus, occa- 
sional display artifacts are generated. 

2D (or intra-field) comb filtering attempts 
to improve the separation of chrominance and 
luminance at the expense of reduced vertical 
resolution. Comb filters get their name by hav- 
ing luminance and chrominance frequency 
responses that look like a comb. Ideally, these 
frequency responses would match the comb- 
like frequency responses of the interleaved 
luminance and chrominance signals shown in 
Figures 8.4 and 8.15.

Modern 3-line comb filters typically use two
line delays to store the previous two lines of video
information (this method adds a 1-line delay to
decoding). Using more than two line
delays usually results in excessive vertical filtering,
reducing vertical resolution.

Two-Line Delay Comb Filters 

The BBC has done research (Reference 4) 
on various PAL comb filtering implementations 
(Figures 9.39 through 9.42). Each was evalu- 
ated for artifacts and frequency response. The 
vertical frequency response for each comb fil- 
ter is shown in Figure 9.43. 

In the comb filter design of Figure 9.39, 
the chrominance phase is inverted over two 
lines of delay. A subtracter cancels most of the 
luminance, leaving double-amplitude, verti- 
cally filtered chrominance. A PAL modifier pro- 
vides a 90° phase shift and removal of the PAL 
switch inversion to phase align the chromi- 
nance with the 1-line-delayed composite video 
signal. Subtracting the chrominance from the 
composite signal leaves luminance. This 
design has the advantage of vertical alignment 
of the chrominance and luminance. However, 
there is a loss of vertical resolution and no sup- 
pression of Hanover bars. In addition, it is pos- 
sible under some circumstances to generate 
double-amplitude luminance due to the aliased 
luminance components produced by the PAL 
modifier. 

The comb filter design of Figure 9.40 is 
similar to the one in Figure 9.39. However, the 
chrominance after the PAL modifier and the
1-line-delayed composite video signal are added
to generate double-amplitude chrominance 
(since the subcarriers are in phase). Again, 
subtracting the chrominance from the compos- 
ite signal leaves luminance. In this design, 
luminance over-ranging is avoided since both 
the true and aliased luminance signals are 
halved. There is less loss of vertical resolution
and Hanover bars are suppressed, at the
expense of increased cross-color.

The comb filter design in Figure 9.41 has 
the advantage of not using a PAL modifier. 
Since the chrominance phase is inverted over 
2 lines of delay, adding them together cancels 
most of the chrominance, leaving double- 
amplitude luminance. This is subtracted from 
the 1-line-delayed composite video signal to 
generate chrominance. Chrominance is then 
subtracted from the 1-line-delayed composite 
video signal to generate luminance (this is to 
maintain vertical luminance resolution). UV 
crosstalk is present as a 12.5 Hz flicker on hor- 
izontal chrominance edges, due to the chromi- 
nance signals not canceling in the adder since 
the line-to-line subcarrier phases are not 
aligned. Since there is no PAL modifier, there 
is no luminance aliasing or luminance over- 
ranging. 

The comb filter design in Figure 9.42 is a 
combination of Figures 9.39 and 9.41. The 
chrominance phase is inverted over two lines 
of delay. An adder cancels most of the chromi- 
nance, leaving double-amplitude luminance. 
This is subtracted from the 1-line-delayed com- 
posite video signal to generate chrominance 
signal (A). In a parallel path, a subtracter cancels
most of the luminance, leaving double-amplitude,
vertically filtered chrominance. A
PAL modifier provides a 90° phase shift and 
removal of the PAL switch inversion to phase 
align to the (A) chrominance signal. These are 
added together, generating double-amplitude 
chrominance. Chrominance then is subtracted 
from the 1-line-delayed composite signal to 
generate luminance. The chrominance and 
luminance vertical frequency responses are 
the average of those for Figures 9.39 and 9.41. 
UV crosstalk is similar to that for Figure 9.41, 
but has half the amplitude. The luminance alias 
is also half that of Figure 9.39, and Hanover 
bars are suppressed.

Figure 9.39. Two-Line Roe PAL Y/C Separator.

Figure 9.40. Two-Line -6 dB Roe PAL Y/C Separator.

Figure 9.41. Two-Line Cosine PAL Y/C Separator.

Figure 9.42. Two-Line Weston PAL Y/C Separator.

Figure 9.43. Vertical Frequency Characteristics of the Comb Filters in Figures 9.39
Through 9.42.

From these comb filter designs, the BBC
has derived designs optimized for general
viewing (Figure 9.44) and standards conversion
(Figure 9.45).

For PAL applications, the best luminance
processing (Figure 9.41) was combined with
the optimum chrominance processing (Figure
9.40). The difference between the two designs
is the chrominance recovery. For standards
conversion (Figure 9.45), the chrominance signal
is just the full-bandwidth composite video
signal. Standards conversion uses vertical
interpolation, which tends to reduce moving
and high vertical frequency components,
including cross-luminance and cross-color.
Thus, vertical chrominance resolution after
processing usually will be better than that
obtained from the circuits for general viewing.
The circuit for general viewing (Figure 9.44)
recovers chrominance with a goal of reducing
cross-effects, at the expense of chrominance
vertical resolution.

For NTSC applications, the design of comb
filters is easier. There are no switched subcarriers
to worry about, and the chrominance
phase changes by 180° per line, rather than 270°. In
addition, there is greater separation between
the luminance and chrominance frequency
bands than in PAL, simplifying the separation
requirements.

Figure 9.44. Two-Line Delay PAL Y/C Separator Optimized for General Viewing.

Figure 9.45. Two-Line Delay PAL Y/C Separator Optimized for Standards Conversion and Video
Processing.

In Figures 9.46 and 9.47, the adder gener- 
ates a double-amplitude composite video signal 
since the subcarriers are in phase. There is a 
180° subcarrier phase difference between the 
output of the adder and the 1-line-delayed com- 
posite video signal, so subtracting the two can- 
cels most of the luminance, leaving double 
amplitude chrominance. 
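
The arithmetic of the two-line-delay NTSC comb can be sketched as follows. The example assumes sampling at four times the subcarrier frequency (910 samples per line) and a vertically uniform picture; the scaling by one half at the adder and subtracter outputs, and all names, are assumptions of the sketch rather than details read from the figures.

```python
SIN4 = (0, 1, 0, -1)   # subcarrier sampled at 4x FSC
LINE = 910             # NTSC samples per line at 4x FSC (227.5 cycles)


def ntsc_line(luma, chroma_amp, line_number):
    """One line of flat luma plus subcarrier; phase follows the running sample count."""
    start = line_number * LINE
    return [luma + chroma_amp * SIN4[(start + i) % 4] for i in range(LINE)]


def comb_separate(above, current, below):
    """Three-line (two-line-delay) comb: returns (luma, chroma) for `current`."""
    luma, chroma = [], []
    for a, c, b in zip(above, current, below):
        neighbors = (a + b) / 2            # adder output, scaled back to unity
        chr_sample = (c - neighbors) / 2   # luma cancels, 2x chroma remains
        chroma.append(chr_sample)
        luma.append(c - chr_sample)        # subtract chroma from the 1-line-delayed signal
    return luma, chroma
```

On a flat field the separation is exact. When the lines above and below differ from the current line, the difference leaks into the chrominance path, which is the source of the artifacts discussed next.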



The main disadvantage of the design in 
Figure 9.46 is the unsuppressed cross-lumi- 
nance on vertical color transitions. However, 
this is offset by the increased luminance reso- 
lution over simple lowpass filtering. The rea- 
sons for processing chrominance in Figure 
9.47 are the same as for PAL in Figure 9.45. 

Adaptive Comb Filtering 

Conventional comb filters still have prob- 
lems with diagonal lines and vertical color 
changes since only vertically aligned samples 
are used for processing.

Figure 9.46. Two-Line Delay NTSC Y/C Separator for General Viewing.

Figure 9.47. Two-Line Delay NTSC Y/C Separator for Standards Conversion and Video Processing.

With diagonal lines, after standard comb
filtering, the chrominance information also 
includes the difference between adjacent lumi- 
nance values, which may also be interpreted as 
chrominance information. This shows up as 
cross-color artifacts, such as a rainbow appear- 
ance along the edge of the line. 

Sharp vertical color transitions generate 
the hanging dot pattern commonly seen on the 
scan line between the two color changes. After 
standard comb filtering, the luminance infor- 
mation contains the color subcarrier. The 
amplitude of the color subcarrier is deter- 
mined by the difference between the two col- 
ors. Thus, different colors modulate the 
luminance intensity differently, creating a dot 
pattern on the scan line between two colors. To 
eliminate these hanging dots, a chroma trap fil- 
ter is sometimes used after the comb filter. 

The adaptive comb filter attempts to solve
these problems by processing a 3 × 3, 5 × 5, or
larger block of samples. The values of the samples
are used to determine which Y/C separation
algorithm to use for the center sample.
Thirty-two or more algorithms may be available.
By looking for sharp vertical transitions
of luminance, or sharp color subcarrier phase
changes, the operation of the comb filter is
changed to avoid generating artifacts.

Due to the cost of integrated line stores, 
the consumer market commonly uses 3-line 
adaptive comb filtering, with the next level of 
improvement being 3D motion adaptive comb 
filtering. 

3D Comb Filtering 

This method (also called inter-field Y/C
separation) uses composite video data from
the current field and from two fields (NTSC)
or four fields (PAL) earlier. Adding the two
cancels the chrominance (since it is 180° out of
phase), leaving luminance. Subtracting the two
cancels the luminance, leaving chrominance.
For PAL, an adequate design may be obtained
by replacing the line delays in Figure 9.42 with
frame delays.

This technique provides nearly perfect Y/C
separation for stationary pictures. However,
if there is any change between fields, the
resulting Y/C separation is erroneous. For this
reason, inter-field Y/C separators usually are
not used, unless as part of a 3D motion adaptive
comb filter.

3D Motion Adaptive Comb Filter 

A typical implementation that uses 3D 
(inter-field) comb filtering for still areas, and 
2D (intra-field) comb filtering for areas of the 
picture that contain motion, is shown in Figure 
9.48. The motion detector generates a value 
(K) of 0-1, allowing the luminance and chromi- 
nance signals from the two comb filters to be 
proportionally mixed. Hard switching between 
algorithms is usually visible. 

Figure 9.49 illustrates a simple motion 
detector block diagram. The concept is to com- 
pare frame-to-frame changes in the low-fre- 
quency luminance signal. Its performance 
determines, to a large degree, the quality of 
the image. The motion signal (K) is usually 
rectified, smoothed by averaging horizontally 
and vertically over a few samples, multiplied 
by a gain factor, and clipped before being used. 
The only error the motion detector should 
make is to use the 2D comb filter on stationary 
areas of the image. 
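
The processing of the motion signal and the proportional mix can be sketched as below. Several things here are assumptions: K = 1 is taken to mean full motion (select the 2D comb), smoothing is horizontal only for brevity, and the gain and window radius are illustrative values.

```python
def motion_factor(curr_luma, prev_luma, gain=0.125, radius=1):
    """Per-sample K in [0, 1] from the frame difference of lowpass luma."""
    diff = [abs(c - p) for c, p in zip(curr_luma, prev_luma)]      # rectify
    n = len(diff)
    k = []
    for i in range(n):
        window = diff[max(0, i - radius):min(n, i + radius + 1)]   # smooth
        k.append(min(1.0, gain * sum(window) / len(window)))       # gain, clip
    return k


def mix(k, sample_2d, sample_3d):
    """Blend comb outputs: K = 1 selects 2D (motion), K = 0 selects 3D (still)."""
    return k * sample_2d + (1.0 - k) * sample_3d
```

Choosing a generous gain biases the detector toward the 2D comb, which is the safe direction for errors: a still area handled by the 2D comb loses a little quality, whereas a moving area handled by the 3D comb produces visible artifacts.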

Alpha Channel Support 

By incorporating an additional ADC within
the NTSC/PAL decoder, an analog alpha signal
(also called a key) may be digitized, and
pipelined with the video data to maintain synchronization.
This allows the designer to
change decoders (which may have different
pipeline delays) to fit specific applications without
worrying about the alpha channel pipeline
delay. Alpha is usually linear, with an analog
range of 0-100 IRE. There is no blanking pedestal
or sync information present.

Figure 9.48. 3D Motion Adaptive Y/C Separator.

Figure 9.49. Simple Motion Detector Block Diagram for NTSC.

Decoder Video Parameters 

Many industry-standard video parameters 
have been defined to specify the relative qual- 
ity of NTSC/PAL decoders. To measure these 
parameters, the output of the NTSC/PAL 
decoder (while decoding various video test sig- 
nals such as those described in Chapter 8) is 
monitored using video test equipment. Along 
with a description of several of these parame- 
ters, typical AC parameter values for both con- 
sumer and studio-quality decoders are shown 
in Table 9.11. 

Several AC parameters, such as short-time 
waveform distortion, group delay, and K fac- 
tors, are dependent on the quality of the analog 
video filters and are not discussed here. In 
addition to the AC parameters discussed in 
this section, there are several others that 
should be included in a decoder specification, 
such as burst capture and lock frequency 
range, and the bandwidths of the decoded YIQ 
or YUV video signals. 

There are also several DC parameters that 
should be specified, as shown in Table 9.12. 
Although genlock capabilities are not usually 
specified, except for clock jitter, we have 
attempted to generate a list of genlock parame- 
ters, shown in Table 9.13. 

Differential Phase 

Differential phase distortion, commonly
referred to as differential phase, specifies how
much the chrominance phase is affected by
the luminance level; in other words, how
much hue shift occurs when the luminance
level changes. Both positive and negative
phase errors may be present, so differential
phase is expressed as a peak-to-peak measurement
in degrees of subcarrier phase.

This parameter is measured using a test 
signal of uniform phase and amplitude chromi- 
nance superimposed on different luminance 
levels, such as the modulated ramp test signal, 
or the modulated five-step portion of the com- 
posite test signal. The differential phase 
parameter for a studio-quality decoder may 
approach 1° or less. 

Differential Gain 

Differential gain distortion, commonly 
referred to as differential gain, specifies how 
much the chrominance gain is affected by the 
luminance level — in other words, how much 
color saturation shift occurs when the lumi- 
nance level changes. Both attenuation and 
amplification may occur, so differential gain is 
expressed as the largest amplitude change 
between any two levels, expressed as a per- 
centage of the largest chrominance amplitude. 

This parameter is measured using a test 
signal of uniform phase and amplitude chromi- 
nance superimposed on different luminance 
levels, such as the modulated ramp test signal, 
or the modulated five-step portion of the com- 
posite test signal. The differential gain parame- 
ter for a studio-quality decoder may approach 
1% or less. 
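
Given the decoded chrominance phase and amplitude measured at each luminance step of the test signal, both differential measurements reduce to simple arithmetic. The sketch below follows the definitions given above; the function names are hypothetical, and phase values are assumed not to wrap around 360°.

```python
def differential_phase(phases_deg):
    """Peak-to-peak chrominance phase variation, in degrees of subcarrier phase."""
    return max(phases_deg) - min(phases_deg)


def differential_gain(amplitudes):
    """Largest amplitude change between any two levels, as a percentage
    of the largest chrominance amplitude."""
    return 100.0 * (max(amplitudes) - min(amplitudes)) / max(amplitudes)
```

For example, phase readings spanning -0.3° to +0.4° across the steps give a differential phase of 0.7°, and amplitudes ranging from 39.6 to 40.0 IRE give a differential gain of 1%.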

Luminance Nonlinearity 

Luminance nonlinearity, also referred to as 
differential luminance and luminance nonlin- 
ear distortion, specifies how much the lumi- 
nance gain is affected by the luminance level. 
In other words, there is a nonlinear relation- 
ship between the decoded luminance level and 
the ideal luminance level. 

| Parameter | Consumer Quality | Studio Quality | Units |
|---|---|---|---|
| differential phase | 4 | <1 | degrees |
| differential gain | 4 | <1 | % |
| luminance nonlinearity | 2 | <1 | % |
| hue accuracy | 3 | <1 | degrees |
| color saturation accuracy | 3 | <1 | % |
| SNR (per EIA/TIA RS-250-C) | 48 | >60 | dB |
| chrominance-to-luminance crosstalk | <-40 | <-50 | dB |
| luminance-to-chrominance crosstalk | <-40 | <-50 | dB |
| H tilt | <1 | <1 | % |
| V tilt | <1 | <1 | % |
| Y/C sampling skew | <5 | <2 | ns |
| demodulation quadrature | 90 ±2 | 90 ±0.5 | degrees |

Table 9.11. Typical AC Video Parameters for NTSC and PAL Decoders. 



| Parameter | (M) NTSC | (B, D, G, H, I) PAL | Units |
|---|---|---|---|
| sync input amplitude | 40 ±20 | 43 ±22 | IRE |
| burst input amplitude | 40 ±20 | 42.86 ±22 | IRE |
| video input amplitude (1 V nominal) | 0.5 to 2.0 | 0.5 to 2.0 | volts |

Table 9.12. Typical DC Video Parameters for NTSC and PAL Decoders. 

| Parameter | Min | Max | Units |
|---|---|---|---|
| sync locking time [note 1] | | 2 | fields |
| sync recovery time [note 2] | | 2 | fields |
| short-term sync lock range [note 3] | ±100 | | ns |
| long-term sync lock range [note 4] | ±5 | | µs |
| number of consecutive missing horizontal sync pulses before any correction | 5 | | sync pulses |
| vertical correlation [note 5] | | ±5 | ns |
| short-term subcarrier locking range [note 6] | ±200 | | Hz |
| long-term subcarrier locking range [note 7] | ±500 | | Hz |
| subcarrier locking time [note 8] | | 10 | lines |
| subcarrier accuracy | | ±2 | degrees |


Notes: 

1. Time from the start of the genlock process until the vertical correlation specification is achieved. 

2. Time from loss of genlock until the vertical correlation specification is achieved. 

3. Range over which vertical correlation specification is maintained. Short-term range 
assumes line time changes by amount indicated slowly between two consecutive lines. 

4. Range over which vertical correlation specification is maintained. Long-term range 
assumes line time changes by amount indicated slowly over one field. 

5. Indicates vertical sample accuracy. For a genlock system that uses a VCO or VCXO, this 
specification is the same as sample clock jitter. 

6. Range over which subcarrier locking time and accuracy specifications are maintained. 
Short-term time assumes subcarrier frequency changes by amount indicated slowly over 2 
frames. 

7. Range over which subcarrier locking time and accuracy specifications are maintained. 
Long-term time assumes subcarrier frequency changes by amount indicated slowly over 
24 hours. 

8. After instantaneous 180° phase shift of subcarrier, time to lock to within ±2°. Subcarrier 
frequency is nominal ±500 Hz. 



Table 9.13. Typical Genlock Parameters for NTSC and PAL Decoders. Parameters 
assume a video signal with > 30 dB SNR and over the range of DC parameters in 
Table 9.12. 

Using an unmodulated five-step or ten-step 
staircase test signal, or the modulated five-step 
portion of the composite test signal, the differ- 
ence between the largest and smallest steps, 
expressed as a percentage of the largest step, 
is used to specify the luminance nonlinearity. 
Although this parameter is included within the 
differential gain and phase parameters, it is tra- 
ditionally specified independently. 
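
The step-difference calculation described above can be sketched as follows. The function name and the sample staircase levels are hypothetical; the input is assumed to be the decoded luminance level (in IRE) of each tread of the staircase.

```python
def luminance_nonlinearity(levels_ire):
    """Difference between the largest and smallest staircase steps,
    as a percentage of the largest step."""
    steps = [b - a for a, b in zip(levels_ire, levels_ire[1:])]
    largest, smallest = max(steps), min(steps)
    return 100.0 * (largest - smallest) / largest


# A perfectly linear five-step staircase gives 0%.
print(luminance_nonlinearity([0, 20, 40, 60, 80, 100]))
# A compressed top step (19 IRE instead of 20) gives 5%.
print(luminance_nonlinearity([0, 20, 40, 60, 80, 99]))
```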

Chrominance Nonlinear Phase Distortion 

Chrominance nonlinear phase distortion 
specifies how much the chrominance phase 
(hue) is affected by the chrominance ampli- 
tude (saturation) — in other words, how much 
hue shift occurs when the saturation changes. 

Using a modulated pedestal test signal, or 
the modulated pedestal portion of the combi- 
nation test signal, the decoder output for each 
chrominance packet is measured. The differ- 
ence between the largest and the smallest hue 
measurements is the peak-to-peak value. This 
parameter is usually not specified indepen- 
dently, but is included within the differential 
gain and phase parameters. 

Chrominance Nonlinear Gain Distortion 

Chrominance nonlinear gain distortion 
specifies how much the chrominance gain is 
affected by the chrominance amplitude (satu- 
ration). In other words, there is a nonlinear 
relationship between the decoded chromi- 
nance amplitude levels and the ideal chromi- 
nance amplitude levels — this is usually seen as 
an attenuation of highly saturated chromi- 
nance signals. 

Using a modulated pedestal test signal, or 
the modulated pedestal portion of the combi- 
nation test signal, the decoder is adjusted so 
that the middle chrominance packet (40 IRE) 
is decoded properly. The largest difference 
between the measured and nominal values of 
the amplitudes of the other two decoded 
chrominance packets specifies the chromi- 
nance nonlinear gain distortion, expressed in 
IRE or as a percentage of the nominal ampli- 
tude of the worst-case packet. This parameter 
is usually not specified independently, but is 
included within the differential gain and phase 
parameters. 

Chrominance-to-Luminance 

Intermodulation 

Chrominance-to-luminance intermodula- 
tion, commonly referred to as cross-modula- 
tion, specifies how much the luminance level is 
affected by the chrominance. This may be the 
result of clipping highly saturated chromi- 
nance levels or quadrature distortion and may 
show up as irregular brightness variations due 
to changes in color saturation. 

Using a modulated pedestal test signal, or 
the modulated pedestal portion of the combi- 
nation test signal, the largest difference 
between the decoded 50 IRE luminance level 
and the decoded luminance levels specifies the 
chrominance-to-luminance intermodulation, 
expressed in IRE or as a percentage. This 
parameter is usually not specified indepen- 
dently, but is included within the differential 
gain and phase parameters. 

Hue Accuracy 

Hue accuracy specifies how closely the 
decoded hue is to the ideal hue value. Both 
positive and negative phase errors may be 
present, so hue accuracy is the difference 
between the worst-case positive and worst-case 
negative measurements from nominal, 
expressed in degrees of subcarrier phase. This 
parameter is measured using EIA or EBU 75% 
color bars as a test signal. 

Color Saturation Accuracy 

Color saturation accuracy specifies how 
close the decoded saturation is to the ideal sat- 
uration value, using EIA or EBU 75% color bars 
as a test signal. Both gain and attenuation may 
be present, so color saturation accuracy is the 
difference between the worst-case gain and 
worst-case attenuation measurements from 
nominal, expressed as a percentage of nomi- 
nal. 

H Tilt 

H tilt, also known as line tilt and line time 
distortion, causes a tilt in line-rate signals, pre- 
dominantly white bars. This type of distortion 
causes variations in brightness between the 
left and right edges of an image. For a digital 
decoder, H tilt is primarily an artifact of the 
analog input filters and the transmission 
medium. H tilt is measured using a line bar 
(such as the one in the NTC-7 NTSC compos- 
ite test signal) and measuring the peak-to-peak 
deviation of the tilt (in IRE or percentage of 
white bar amplitude), ignoring the first and last 
microsecond of the white bar. 

V Tilt 

V tilt, also known as field tilt and field time 
distortion, causes a tilt in field-rate signals, pre- 
dominantly white bars. This type of distortion 
causes variations in brightness between the 
top and bottom edges of an image. For a digital 
decoder, V tilt is primarily an artifact of the 
analog input filters and the transmission 
medium. V tilt is measured using an 18 µs, 100 
IRE white bar in the center of 130 lines in the 
center of the field or using a field square wave. 
The peak-to-peak deviation of the tilt is 
measured (in IRE or percentage of white bar 
amplitude), ignoring the first and last three lines. 

Chapter 10 



H.261 and H.263 



There are several standards for video con- 
ferencing, as shown in Table 10.1. Figures 10.1 
through 10.3 illustrate the block diagrams of 
several common video conferencing systems. 



H.261 

ITU-T H.261 was the first video compres- 
sion and decompression standard developed 
for video conferencing. Originally designed for 
bit-rates of p x 64 kbps, where p is in the range 
1-30, H.261 is now the minimum requirement 
of all video conferencing standards, as shown 
in Table 10.1. 

A typical H.261 encoder block diagram is 
shown in Figure 10.4. The video encoder pro- 
vides a self-contained digital video bitstream 
which is multiplexed with other signals, such 
as control and audio. The video decoder per- 
forms the reverse process. 

H.261 video data uses the 4:2:0 YCbCr for- 
mat shown in Figure 3.7, with the primary 
specifications listed in Table 10.2. The maxi- 
mum picture rate may be restricted by having 
0, 1, 2, or 3 non-transmitted pictures between 
transmitted ones. 



Two picture (or frame) types are sup- 
ported: 

Intra or I Frame: A frame having no reference 
frame for prediction. 

Inter or P Frame: A frame based on a previous 
frame. 

Video Coding Layer 

As shown in Figure 10.4, the basic func- 
tions are prediction, block transformation, and 
quantization. 

The prediction error (inter mode) or the 
input picture (intra mode) is subdivided into 8 
sample x 8 line blocks that are segmented as 
transmitted or non-transmitted. Four lumi- 
nance blocks and the two spatially correspond- 
ing color difference blocks are combined to 
form a 16 sample x 16 line macroblock as 
shown in Figure 10.5. 

The criteria for choosing the coding mode and 
for transmitting a block are not specified by 
the Recommendation and may 
be varied dynamically as part of the coding 
strategy. Transmitted blocks are transformed 
and the resulting coefficients quantized and 
variable-length coded. 

| | H.310 | H.320 | H.321 | H.322 | H.323 | H.324 | 3G-324M |
|---|---|---|---|---|---|---|---|
| network | Broadband ISDN, ATM LAN | Narrowband Switched Digital ISDN | Broadband ISDN, ATM LAN | Guaranteed Bandwidth Packet Switched Networks | Non-guaranteed Bandwidth Packet Switched Networks (Ethernet) | PSTN or POTS | Mobile |
| video codec | MPEG-2, H.261 | H.261, H.263, H.264 | H.261, H.263 | H.261, H.263 | H.261, H.263, H.264 | H.261, H.263 | MPEG-4.2 |
| audio codec | MPEG-2, G.711, G.722, G.728 | G.711, G.722, G.722.1, G.728 | G.711, G.722, G.728 | G.711, G.722, G.728 | G.711, G.722, G.722.1, G.723.1, G.728, G.729 | G.723 | G.722.2, G.723.1 |
| multiplexing | H.222.0, H.222.1 | H.221 | H.221 | H.221 | H.225.0 | H.223 | H.223A/B |
| control | H.245 | H.231, H.242, H.243 | H.242 | H.230, H.242 | H.245 | H.245 | H.245 |
| multipoint | | H.231 | H.231 | H.231 | H.323 | | |
| data | T.120 | H.239, T.120 | T.120 | T.120 | H.239, T.120 | T.120 | T.120 |
| communications interface | AAL I.363, ATM I.361, PHY I.432 | I.400 | AAL I.363, ATM I.361, PHY I.400 | I.400 and TCP/IP | TCP/IP | V.34 modem | Mobile Radio |

Table 10.1. Video Conferencing Family of Standards. 

Figure 10.1. Typical H.320 System. 

Figure 10.2. Typical H.323 System. 

Figure 10.3. Typical H.324 System. 


Prediction 

The prediction is inter-picture and may 
include motion compensation and a spatial fil- 
ter. The coding mode using prediction is called 
inter, the coding mode using no prediction is 
called intra. 

Motion Compensation 

Motion compensation is optional in the 
encoder. The decoder must support the accep- 
tance of one motion vector per macroblock. 
Motion vectors are restricted — all samples ref- 
erenced by them must be within the coded pic- 
ture area. 

The horizontal and vertical components of 
motion vectors have integer values not 
exceeding ±15. The motion vector is used for all four 
Y blocks in the macroblock. The motion vector 
for both the Cb and Cr blocks is derived by 
halving the values of the macroblock vector. 
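
As a minimal sketch of the chrominance vector derivation: the text above only says the values are halved, so the rounding rule used here (truncating the halved value toward zero so the result stays an integer) is an assumption that should be checked against the H.261 Recommendation. The function name is hypothetical.

```python
def chroma_vector(mv_x, mv_y):
    """Derive the Cb/Cr motion vector from the macroblock (luminance) vector."""
    for component in (mv_x, mv_y):
        if not -15 <= component <= 15:
            raise ValueError("H.261 vector components must lie within +/-15")

    def halve(value):
        # Assumption: halve and truncate toward zero; int() truncates toward zero.
        return int(value / 2)

    return halve(mv_x), halve(mv_y)


print(chroma_vector(3, -3))
print(chroma_vector(15, -15))
```

Truncating toward zero treats positive and negative displacements symmetrically, which floor division (`//`) would not.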



A positive value of the horizontal or verti- 
cal component of the motion vector indicates 
that the prediction is formed from samples in 
the previous picture that are spatially to the 
right or below the samples being predicted. 

Loop Filter 

The prediction process may use a 2D spa- 
tial filter that operates on samples within a pre- 
dicted 8x8 block. 

The filter is separated into horizontal and 
vertical functions. Both are non-recursive with 
coefficients of 0.25, 0.5, 0.25 except at block 
edges where one of the taps falls outside the 
block. In such cases, the filter coefficients are 
changed to 0, 1, 0. 

The filter is switched on or off for all six 
blocks in a macroblock according to the mac- 
roblock type. 
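
The separable filter can be sketched in integer arithmetic as below. The tap values and the block-edge rule come from the text; carrying full precision through both passes and rounding only once at the end (with halves rounded up) is an assumption, and the function name is hypothetical.

```python
def loop_filter_8x8(block):
    """Apply the 1/4, 1/2, 1/4 separable filter to an 8 x 8 block of
    non-negative integer samples, given as a list of 8 rows."""

    def pass_1d(line):
        # Taps are scaled by 4 so the arithmetic stays in integers.
        out = []
        last = len(line) - 1
        for i, value in enumerate(line):
            if i == 0 or i == last:
                out.append(4 * value)                      # edge taps: 0, 1, 0
            else:
                out.append(line[i - 1] + 2 * value + line[i + 1])
        return out

    rows = [pass_1d(row) for row in block]                 # horizontal pass
    cols = [pass_1d(list(col)) for col in zip(*rows)]      # vertical pass
    # Undo the combined scale factor of 16, rounding halves up (assumption).
    return [[(cols[x][y] + 8) >> 4 for x in range(8)] for y in range(8)]


flat = [[100] * 8 for _ in range(8)]
print(loop_filter_8x8(flat)[0])
```

A flat block passes through unchanged, and the four corner samples are never modified because both of their 1D passes use the 0, 1, 0 taps.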

[Block diagram. Labeled encoder outputs: intra/inter flag, transmit flag, quantizer indication, quantizing index, motion vector, loop filter on/off.] 

Figure 10.4. Typical H.261 Encoder. 



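| Parameters | CIF | QCIF |
|---|---|---|
| active resolution (Y) | 352 x 288 | 176 x 144 |
| frame rate | 29.97 Hz | 29.97 Hz |
| YCbCr sampling structure | 4:2:0 | 4:2:0 |
| form of YCbCr coding | Uniformly quantized PCM, 8 bits per sample. | Uniformly quantized PCM, 8 bits per sample. |

Table 10.2. H.261 YCbCr Parameters. 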

DCT, IDCT 

Transmitted blocks are first processed by 
an 8 x 8 DCT (discrete cosine transform). The 
output from the IDCT (inverse DCT) ranges 
from -256 to +255 after clipping, represented 
using 9 bits. 

The procedures for computing the trans- 
forms are not defined, but the inverse trans- 
form must meet the specified error tolerance. 

Quantization 

Within a macroblock, the same quantizer 
is used for all coefficients, except the one for 
intra-DC. The intra-DC coefficient is usually 
linearly quantized with a step size of 8 and no 
dead zone. The other coefficients use one of 31 
possible linear quantizers, but with a central 
dead zone about 0 and a step size of an even 
value in the range of 2-62. 



Clipping of Reconstructed Picture 

Clipping functions are used to prevent 
quantization distortion of transform coefficient 
amplitudes from causing arithmetic overflows 
in the encoder and decoder loops. The 
clipping function is applied to the recon- 
structed picture, formed by summing the pre- 
diction and the prediction error. Clippers force 
sample values less than 0 to be 0 and values 
greater than 255 to be 255. 

Coding Control 

Although not included as part of H.261, 
several parameters may be varied to control 
the rate of coded video data. These include 
processing prior to coding, the quantizer, 
block significance criterion, and temporal sub- 
sampling. Temporal subsampling is performed 
by discarding complete pictures. 

Figure 10.5. H.261 Arrangement of Group of Blocks, Macroblocks, and Blocks. 

Forced Updating 

This is achieved by forcing the use of the 
intra mode of the coding algorithm. To control 
the accumulation of inverse transform mis- 
match errors, a macroblock should be forcibly 
updated at least once every 132 times it is 
transmitted. 

Video Bitstream 

Unless specified otherwise, the most sig- 
nificant bits are transmitted first. This is bit 1 
and is the leftmost bit in the code tables. 
Unless specified otherwise, all unused or spare 
bits are set to “1.” 



The video bitstream is a hierarchical struc- 
ture with four layers. From top to bottom the 
layers are: 

Picture 

Group of Blocks (GOB) 

Macroblock (MB) 

Block 

Picture Layer 

Data for each picture consists of a picture 
header followed by data for the groups of blocks 
(GOBs). The structure is shown in Figure 
10.6. Picture headers for dropped pictures are 
not transmitted. 




Figure 10.6. H.261 Video Bitstream Layer Structures. 

Picture Start Code (PSC) 

PSC is a 20-bit word with a value of 0000 
0000 0000 0001 0000. 

Temporal Reference (TR) 

TR is a 5-bit binary number representing 
32 possible values. It is generated by incre- 
menting the value in the previous picture 
header by one plus the number of 
non-transmitted pictures (at 29.97 Hz). The arithmetic is 
performed with only the five LSBs. 
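
The TR update rule amounts to modulo-32 arithmetic. A minimal sketch, with a hypothetical function name:

```python
def next_temporal_reference(previous_tr, skipped_pictures):
    """TR for the next transmitted picture: the previous TR plus one plus the
    number of non-transmitted pictures, keeping only the five LSBs."""
    return (previous_tr + 1 + skipped_pictures) & 0x1F


# With TR = 30 and three pictures dropped, the counter wraps around.
print(next_temporal_reference(30, 3))
```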

Type Information (PTYPE) 

Six bits of information about the picture 
are: 

Bit 1    Split screen indicator: “0” = off, “1” = on 
Bit 2    Document camera indicator: “0” = off, “1” = on 
Bit 3    Freeze picture release: “0” = off, “1” = on 
Bit 4    Source format: “0” = QCIF, “1” = CIF 
Bit 5    Optional still image mode: “0” = on, “1” = off 
Bit 6    Spare 


Extra Insertion Information (PEI) 

PEI is a bit which when set to “1” indicates 
the presence of the following optional data 
field. 

Spare Information (PSPARE) 

If PEI is set to “1,” then these 9 bits follow 
consisting of 8 bits of data (PSPARE) and 
another PEI bit to indicate if a further 9 bits fol- 
low, and so on. 
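
The picture header fields described above (PSC, TR, PTYPE, PEI, PSPARE) can be read in sequence as sketched below. For simplicity the bitstream is assumed to be a Python string of "0" and "1" characters positioned at the start code; the function name and dictionary keys are hypothetical.

```python
PSC = "00000000000000010000"  # 20-bit picture start code


def parse_picture_header(bits):
    """Parse an H.261 picture header from a string of '0'/'1' characters."""
    pos = 0

    def take(count):
        nonlocal pos
        field = bits[pos:pos + count]
        if len(field) < count:
            raise ValueError("truncated picture header")
        pos += count
        return field

    if take(20) != PSC:
        raise ValueError("picture start code not found")
    tr = int(take(5), 2)
    ptype = take(6)                      # bit 1 is transmitted first
    header = {
        "tr": tr,
        "split_screen": ptype[0] == "1",
        "document_camera": ptype[1] == "1",
        "freeze_picture_release": ptype[2] == "1",
        "source_format": "CIF" if ptype[3] == "1" else "QCIF",
        "still_image_mode": ptype[4] == "0",   # note the inverted sense: "0" = on
        "pspare": [],
    }
    while take(1) == "1":                # PEI: another PSPARE byte follows
        header["pspare"].append(int(take(8), 2))
    header["bits_consumed"] = pos
    return header


example = PSC + "00101" + "000111" + "1" + "10101010" + "0"
print(parse_picture_header(example))
```

The same PEI-style loop applies to the GEI and GSPARE fields of the GOB header.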



Group of Blocks (GOB) Layer 

Each picture is divided into groups of 
blocks (GOB). A GOB comprises one-twelfth 
of the CIF picture area or one-third of the 
QCIF picture area (see Figure 10.5). A GOB 
relates to 176 samples x 48 lines of Y and the 
corresponding 88 x 24 array of Cb and Cr data. 

Data for each GOB consists of a GOB 
header followed by macroblock data, as shown 
in Figure 10.6. Each GOB header is transmit- 
ted once between picture start codes in the 
CIF or QCIF sequence numbered in Figure 
10.5, even if no macroblock data is present in 
that GOB. 
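
The GOB counts quoted above follow directly from the picture and GOB dimensions. A short arithmetic check, with hypothetical names, that also gives the number of 16 x 16 macroblocks contained in one GOB:

```python
GOB_WIDTH, GOB_HEIGHT = 176, 48   # luminance samples x lines per GOB
MACROBLOCK_SIZE = 16              # luminance samples (and lines) per macroblock


def gob_layout(width, height):
    """Return (GOBs per picture, macroblocks per GOB) for a luminance picture size."""
    gobs = (width // GOB_WIDTH) * (height // GOB_HEIGHT)
    macroblocks = (GOB_WIDTH // MACROBLOCK_SIZE) * (GOB_HEIGHT // MACROBLOCK_SIZE)
    return gobs, macroblocks


print(gob_layout(352, 288))  # CIF
print(gob_layout(176, 144))  # QCIF
```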

Group of Blocks Start Code (GBSC) 

GBSC is a 16-bit word with a value of 0000 
0000 0000 0001. 

Group Number (GN) 

GN is a 4-bit binary value indicating the 
position of the group of blocks. The bits are 
the binary representation of the number in Fig- 
ure 10.5. Numbers 13, 14, and 15 are reserved 
for future use. 

Quantizer Information (GQUANT) 

GQUANT is a 5-bit binary value that indi- 
cates the quantizer used for the group of 
blocks until overridden by any subsequent 
MQUANT. Values of 1-31 are allowed. 

Extra Insertion Information (GEI) 

GEI is a bit which, when set to “1,” indi- 
cates the presence of the following optional 
data field. 

Spare Information (GSPARE) 

If GEI is set to “1,” then these 9 bits follow 
consisting of 8 bits of data (GSPARE) and then 
another GEI bit to indicate if a further 9 bits 
follow, and so on. 

Macroblock (MB) Layer 

Each GOB is divided into 33 macroblocks 
as shown in Figure 10.5. A macroblock relates 
to 16 samples x 16 lines of Y and the 
corresponding 8 x 8 array of Cb and Cr data. 

Data for a macroblock consists of a macro- 
block header followed by data for blocks (see 
Figure 10.6). 

Macroblock Address (MBA) 

MBA is a variable-length codeword indicat- 
ing the position of a macroblock within a group 
of blocks. The transmission order is shown in 
Figure 10.5. For the first macroblock in a GOB, 
MBA is the absolute address in Figure 10.5. 
For subsequent macroblocks, MBA is the dif- 
ference between the absolute addresses of the 
macroblock and the last transmitted macro- 
block. The code table for MBA is given in 
Table 10.3. 
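
The addressing rule can be sketched as follows: given the absolute addresses of the macroblocks actually transmitted in one GOB, compute the MBA value coded for each. The function name is hypothetical, and the variable-length code lookup of Table 10.3 is omitted.

```python
def mba_values(transmitted_addresses):
    """Map absolute macroblock addresses (1..33, ascending) to MBA values.

    The first value is the absolute address; each later value is the
    difference from the previously transmitted macroblock.
    """
    values = []
    previous = 0   # makes the first difference equal the absolute address
    for address in transmitted_addresses:
        if not 1 <= address <= 33 or address <= previous:
            raise ValueError("addresses must be ascending and within 1..33")
        values.append(address - previous)
        previous = address
    return values


# Macroblocks 3 and 4 and 6..32 are skipped.
print(mba_values([1, 2, 5, 33]))
```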

A codeword is available for bit stuffing 
immediately after a GOB header or a coded 
macroblock (called MBA stuffing). This code- 
word is discarded by decoders. 

The codeword for the start code is also 
shown in Table 10.3. MBA is always included 
in transmitted macroblocks. Macroblocks are 
not transmitted when they contain no informa- 
tion for that part of the picture. 

Type Information (MTYPE) 

MTYPE is a variable-length codeword con- 
taining information about the macroblock and 
data elements that are present. Macroblock 
types, included elements, and variable-length 
codewords are listed in Table 10.4. MTYPE is 
always included in transmitted macroblocks. 

Quantizer (MQUANT) 

MQUANT is present only if indicated by 
MTYPE. It is a 5-bit codeword indicating the 
quantizer to use for this and any following 
blocks in the group of blocks, until overridden 
by any subsequent MQUANT. Codewords for 
MQUANT are the same as for GQUANT. 

Motion Vector Data (MVD) 

Motion vector data is included for all 
motion-compensated (MC) macroblocks, as 
indicated by MTYPE. MVD is obtained from 
the macroblock vector by subtracting the vec- 
tor of the preceding macroblock. The vector of 
the previous macroblock is regarded as zero 
for the following situations: 

(a) Evaluating MVD for macroblocks 1, 12, and 
23. 

(b) Evaluating MVD for macroblocks where 
MBA does not represent a difference of 1. 

(c) MTYPE of the previous macroblock was not 
motion-compensated. 

Motion vector data consists of a variable- 
length codeword for the horizontal component, 
followed by a variable-length codeword for the 
vertical component. The variable-length codes 
are listed in Table 10.5. 
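
The prediction rules (a) through (c) can be sketched as below. All names are hypothetical. The second helper reflects an assumption about how the paired entries of Table 10.5 are resolved: each codeword stands for two differences 32 apart, and the decoder is assumed to pick the one that keeps the reconstructed component within ±15.

```python
def mvd(vector, mb_address, mba, previous_vector, previous_was_mc):
    """Motion vector data for one macroblock: vector minus its predictor."""
    if mb_address in (1, 12, 23) or mba != 1 or not previous_was_mc:
        predictor = (0, 0)               # rules (a), (b), and (c)
    else:
        predictor = previous_vector
    return (vector[0] - predictor[0], vector[1] - predictor[1])


def resolve_component(predictor, coded_difference):
    """Recover one vector component from a coded difference in -16..15.

    Assumption: of the two differences sharing a codeword, only one
    yields a component within +/-15.
    """
    for candidate in (coded_difference, coded_difference - 32, coded_difference + 32):
        if -15 <= predictor + candidate <= 15:
            return predictor + candidate
    raise ValueError("no valid vector for this difference")


print(mvd((3, -2), mb_address=5, mba=1, previous_vector=(1, 1), previous_was_mc=True))
print(resolve_component(-10, -16))
```

Since valid components span only 31 values, two candidates 32 apart can never both be valid, so the choice is unambiguous.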

Coded Block Pattern (CBP) 

The variable-length CBP is present if indi- 
cated by MTYPE. It indicates which blocks in 
the macroblock have at least one transform 
coefficient transmitted. The pattern number is 
represented as: 

P0 P1 P2 P3 P4 P5 

where Pn = “1” for any coefficient present for 
block [n], else Pn = “0.” Block numbering (decimal 
format) is given in Figure 10.5. 

The codewords for the CBP number are 
given in Table 10.6. 
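
A minimal sketch of forming the pattern number from the six per-block flags. Reading P0 P1 P2 P3 P4 P5 as a binary number with P0 as the most significant bit is an assumption based on the left-to-right notation above; the function names are hypothetical.

```python
def cbp_number(coded_flags):
    """Pattern number from six flags ordered P0..P5 (P0 assumed to be the MSB)."""
    if len(coded_flags) != 6:
        raise ValueError("a macroblock has exactly six blocks")
    number = 0
    for flag in coded_flags:
        number = (number << 1) | int(bool(flag))
    return number


def cbp_flags(number):
    """Inverse of cbp_number: recover the six flags from the pattern number."""
    return [bool((number >> (5 - n)) & 1) for n in range(6)]


# First four blocks coded, last two blocks not coded.
print(cbp_number([1, 1, 1, 1, 0, 0]))
```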

| MBA | Code | MBA | Code |
|---|---|---|---|
| 1 | 1 | 17 | 0000 0101 10 |
| 2 | 011 | 18 | 0000 0101 01 |
| 3 | 010 | 19 | 0000 0101 00 |
| 4 | 0011 | 20 | 0000 0100 11 |
| 5 | 0010 | 21 | 0000 0100 10 |
| 6 | 0001 1 | 22 | 0000 0100 011 |
| 7 | 0001 0 | 23 | 0000 0100 010 |
| 8 | 0000 111 | 24 | 0000 0100 001 |
| 9 | 0000 110 | 25 | 0000 0100 000 |
| 10 | 0000 1011 | 26 | 0000 0011 111 |
| 11 | 0000 1010 | 27 | 0000 0011 110 |
| 12 | 0000 1001 | 28 | 0000 0011 101 |
| 13 | 0000 1000 | 29 | 0000 0011 100 |
| 14 | 0000 0111 | 30 | 0000 0011 011 |
| 15 | 0000 0110 | 31 | 0000 0011 010 |
| 16 | 0000 0101 11 | 32 | 0000 0011 001 |
| | | 33 | 0000 0011 000 |
| | | MBA stuffing | 0000 0001 111 |
| | | start code | 0000 0000 0000 0001 |

Table 10.3. H.261 Variable-Length Code Table for MBA. 



| Prediction | MQUANT | MVD | CBP | TCOEFF | Code |
|---|---|---|---|---|---|
| intra | | | | x | 0001 |
| intra | x | | | x | 0000 001 |
| inter | | | x | x | 1 |
| inter | x | | x | x | 0000 1 |
| inter + MC | | x | | | 0000 0000 1 |
| inter + MC | | x | x | x | 0000 0001 |
| inter + MC | x | x | x | x | 0000 0000 01 |
| inter + MC + FIL | | x | | | 001 |
| inter + MC + FIL | | x | x | x | 01 |
| inter + MC + FIL | x | x | x | x | 0000 01 |

Table 10.4. H.261 Variable-Length Code Table for MTYPE. 

Block Layer 

A macroblock is made up of four Y blocks, 
a Cb block, and a Cr block (see Figure 10.5). 

Data for an 8 sample x 8 line block consists 
of codewords for the transform coefficients fol- 
lowed by an end of block (EOB) marker as 
shown in Figure 10.6. The order of block trans- 
mission is shown in Figure 10.5. 



Transform Coefficients (TCOEFF) 

When MTYPE indicates intra, transform 
coefficient data is present for all six blocks in a 
macroblock. Otherwise, MTYPE and CBP sig- 
nal which blocks have coefficient data trans- 
mitted for them. The quantized DCT 
coefficients are transmitted in the order shown 
in Figure 7.59. 



| Vector Difference | Code | Vector Difference | Code |
|---|---|---|---|
| -16 & 16 | 0000 0011 001 | 1 | 010 |
| -15 & 17 | 0000 0011 011 | 2 & -30 | 0010 |
| -14 & 18 | 0000 0011 101 | 3 & -29 | 0001 0 |
| -13 & 19 | 0000 0011 111 | 4 & -28 | 0000 110 |
| -12 & 20 | 0000 0100 001 | 5 & -27 | 0000 1010 |
| -11 & 21 | 0000 0100 011 | 6 & -26 | 0000 1000 |
| -10 & 22 | 0000 0100 11 | 7 & -25 | 0000 0110 |
| -9 & 23 | 0000 0101 01 | 8 & -24 | 0000 0101 10 |
| -8 & 24 | 0000 0101 11 | 9 & -23 | 0000 0101 00 |
| -7 & 25 | 0000 0111 | 10 & -22 | 0000 0100 10 |
| -6 & 26 | 0000 1001 | 11 & -21 | 0000 0100 010 |
| -5 & 27 | 0000 1011 | 12 & -20 | 0000 0100 000 |
| -4 & 28 | 0000 111 | 13 & -19 | 0000 0011 110 |
| -3 & 29 | 0001 1 | 14 & -18 | 0000 0011 100 |
| -2 & 30 | 0011 | 15 & -17 | 0000 0011 010 |
| -1 | 011 | | |
| 0 | 1 | | |

Table 10.5. H.261 Variable-Length Code Table for MVD. 

[Diagram of a decoded sequence made up of intra (I) frames and predicted (P) frames.] 

Figure 10.7. Typical H.261 Decoded Sequence. 



The most common combinations of suc- 
cessive zeros (RUN) and the following value 
(LEVEL) are encoded using variable-length 
codes, listed in Table 10.7. Since CBP indicates 
blocks with no coefficient data, EOB cannot 
occur as the first coefficient. The last bit “s” 
denotes the sign of the level: “0” = positive, “1” 
= negative. 

Other combinations of (RUN, LEVEL) are 
encoded using a 20-bit word: six bits of escape 
(ESC), six bits of RUN, and eight bits of 
LEVEL, as shown in Table 10.8. 
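
The 20-bit escape word can be assembled as sketched below. The ESC pattern 0000 01 is taken from the TCOEFF code table, and LEVEL is written as an 8-bit two's complement value, which matches the entries of Table 10.8 (for example, -127 is 1000 0001). The function name is hypothetical.

```python
ESC = "000001"  # six-bit escape code


def escape_codeword(run, level):
    """Build the 20-bit ESC + RUN + LEVEL word as a string of bits."""
    if not 0 <= run <= 63:
        raise ValueError("RUN must fit in six bits")
    if level == 0 or not -127 <= level <= 127:
        raise ValueError("LEVEL values 0 and -128 are forbidden")
    return ESC + format(run, "06b") + format(level & 0xFF, "08b")


print(escape_codeword(0, -127))
print(escape_codeword(63, 2))
```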

Two code tables are used for the variable- 
length coding: one is used for the first trans- 
mitted LEVEL in inter, inter + MC, and inter + 
MC + FIL blocks; another is used for all other 
LEVELs, except for the first one in intra 
blocks, which is fixed-length coded with eight 
bits. 

All coefficients, except for intra-DC, have 
reconstruction levels (REC) in the range -2048 
to 2047. Reconstruction levels are recovered 
by the following equations, and the results are 
clipped. QUANT ranges from 1 to 31 and is 
transmitted by either GQUANT or MQUANT. 



QUANT = odd: 
    for LEVEL > 0:  REC = QUANT x (2 x LEVEL + 1) 
    for LEVEL < 0:  REC = QUANT x (2 x LEVEL - 1) 

QUANT = even: 
    for LEVEL > 0:  REC = (QUANT x (2 x LEVEL + 1)) - 1 
    for LEVEL < 0:  REC = (QUANT x (2 x LEVEL - 1)) + 1 

For LEVEL = 0 (any QUANT): 
    REC = 0 

For intra-DC blocks, the first coefficient is 
typically the transform value quantized with a 
step size of 8 and no dead zone, resulting in an 
8-bit coded value, n. Black has a coded value of 
0001 0000 (16), and white has a coded value of 
1110 1011 (235). A transform value of 1024 is 
coded as 1111 1111. Coded values of 0000 0000 
and 1000 0000 are not used. The decoded value 
is 8n, except that an n value of 255 results in a 
reconstructed transform value of 1024. 
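
The reconstruction rules for both coefficient types can be sketched in a few lines. The function names are hypothetical; the clipping range of -2048 to 2047 and the intra-DC special cases are those stated in the text.

```python
def reconstruct(level, quant):
    """Reconstruction level (REC) for any coefficient other than intra-DC."""
    if not 1 <= quant <= 31:
        raise ValueError("QUANT ranges from 1 to 31")
    if level == 0:
        return 0
    sign = 1 if level > 0 else -1
    rec = quant * (2 * level + sign)
    if quant % 2 == 0:
        rec -= sign          # even QUANT: move one step toward zero
    return max(-2048, min(2047, rec))


def reconstruct_intra_dc(n):
    """Transform value for the 8-bit intra-DC code n."""
    if n in (0, 128):
        raise ValueError("coded values 0000 0000 and 1000 0000 are not used")
    return 1024 if n == 255 else 8 * n


print(reconstruct(1, 1), reconstruct(1, 2), reconstruct(-1, 2))
print(reconstruct_intra_dc(16), reconstruct_intra_dc(255))
```

For QUANT = 2 the nonzero reconstruction levels are ±5, ±9, and so on, which shows the central dead zone mentioned in the Quantization section.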

| Run | Level | Code |
|---|---|---|
| 4 | 1 | 0011 0s |
| 4 | 2 | 0000 0011 11s |
| 4 | 3 | 0000 0001 0010 s |
| 5 | 1 | 0001 11s |
| 5 | 2 | 0000 0010 01s |
| 5 | 3 | 0000 0000 1001 0s |
| 6 | 1 | 0001 01s |
| 6 | 2 | 0000 0001 1110 s |
| 7 | 1 | 0001 00s |
| 7 | 2 | 0000 0001 0101 s |
| 8 | 1 | 0000 111s |
| 8 | 2 | 0000 0001 0001 s |
| 9 | 1 | 0000 101s |
| 9 | 2 | 0000 0000 1000 1s |
| 10 | 1 | 0010 0111 s |
| 10 | 2 | 0000 0000 1000 0s |
| 11 | 1 | 0010 0011 s |
| 12 | 1 | 0010 0010 s |
| 13 | 1 | 0010 0000 s |
| 14 | 1 | 0000 0011 10s |
| 15 | 1 | 0000 0011 01s |
| 16 | 1 | 0000 0010 00s |
| 17 | 1 | 0000 0001 1111 s |
| 18 | 1 | 0000 0001 1010 s |
| 19 | 1 | 0000 0001 1001 s |
| 20 | 1 | 0000 0001 0111 s |
| 21 | 1 | 0000 0001 0110 s |
| 22 | 1 | 0000 0000 1111 1s |
| 23 | 1 | 0000 0000 1111 0s |
| 24 | 1 | 0000 0000 1110 1s |
| 25 | 1 | 0000 0000 1110 0s |
| 26 | 1 | 0000 0000 1101 1s |
| ESC | | 0000 01 |

Table 10.7b. H.261 Variable-Length Code Table for TCOEFF. 

| Run | Code | Level | Code |
|---|---|---|---|
| 0 | 0000 00 | -128 | forbidden |
| 1 | 0000 01 | -127 | 1000 0001 |
| ... | ... | ... | ... |
| 63 | 1111 11 | -2 | 1111 1110 |
| | | -1 | 1111 1111 |
| | | 0 | forbidden |
| | | 1 | 0000 0001 |
| | | 2 | 0000 0010 |
| | | ... | ... |
| | | 127 | 0111 1111 |

Table 10.8. H.261 Run, Level Codes. 



Still Image Transmission 

H.261 allows the transmission of a still 
image of four times the resolution of the cur- 
rently selected video format. If the video for- 
mat is QCIF, a still image of CIF resolution 
may be transmitted; if the video format is CIF, 
a still image of 704 x 576 resolution may be 
transmitted. 



H.263 

ITU-T H.263 improves on H.261 by provid- 
ing improved video quality at lower bit-rates. 

The video encoder provides a self-con- 
tained digital bitstream which is combined 
with other signals (such as H.223). The video 
decoder performs the reverse process. The 
primary specifications of H.263 regarding 
YCbCr video data are listed in Table 10.9. It is 
also possible to negotiate a custom picture 
size. The 4:2:0 YCbCr sampling is shown in 
Figure 3.7. 

Seven frame (or picture) types are 
supported, with the first two being mandatory 
(baseline H.263): 

Intra or I Frame: A frame having no reference 
frame for prediction. 

Inter or P Frame: A frame based on a previous 
frame. 

PB Frame and Improved PB Frame: A frame 
representing two frames and based on a previous 
frame. 

B Frame: A frame based on two reference frames, 
one previous and one afterwards. 

EI Frame: A frame having a temporally 
simultaneous frame which has either the same or 
smaller frame size. 

EP Frame: A frame having two reference 
frames, one previous and one simultaneous. 

Video Coding Layer 

A typical encoder block diagram is shown 
in Figure 10.8. The basic functions are predic- 
tion, block transformation, and quantization. 

The prediction error or the input picture 
is subdivided into 8 x 8 blocks which are 
segmented as transmitted or non-transmitted. 
Four luminance blocks and the two spatially 
corresponding color difference blocks are 
combined to form a macroblock as shown in 
Figure 10.9. 

The criteria for choosing the coding mode and 
for transmitting a block are not specified by 
the Recommendation and may 
be varied dynamically as part of the coding 
strategy. Transmitted blocks are transformed 
and the resulting coefficients are quantized 
and variable-length coded. 

Prediction 

The prediction is interpicture and may 
include motion compensation. The coding 
mode using prediction is called inter, the cod- 
ing mode using no prediction is called intra. 

Intra-coding is signaled at the picture level 
(I frame for intra or P frame for inter) or at the 
macroblock level in P frames. In the optional 
PB frame mode, B frames always use the inter 
mode. 

Motion Compensation 

Motion compensation is optional in the en- 
coder. The decoder must support accepting 
one motion vector per macroblock (one or four 
motion vectors per macroblock in the optional 
advanced prediction or deblocking filter modes). 

In the optional PB frame mode, each mac- 
roblock may have an additional vector. In the 
optional improved PB frame mode, each mac- 
roblock can include an additional forward 
motion vector. In the optional B frame mode, 
macroblocks can be transmitted with both a 
forward and backward motion vector. 



For baseline H.263, motion vectors are 
restricted such that all samples referenced by 
them are within the coded picture area. Many 
of the optional modes remove this restriction. 
The horizontal and vertical components of 
motion vectors have integer or half-integer 
values in the range -16 to +15.5. Several of the 
optional modes increase the range to [-31.5, 
+31.5] or [-31.5, +30.5]. 

A positive value of the horizontal or verti- 
cal component of the motion vector typically 
indicates that the prediction is formed from 
samples in the previous frame which are spa- 
tially to the right or below the samples being 
predicted. However, for backward motion vec- 
tors in B frames, a positive value of the hori- 
zontal or vertical component of the motion 
vector indicates that the prediction is formed 
from samples in the next frame which are spa- 
tially to the left or above the samples being pre- 
dicted. 

Quantization 

The number of quantizers is 1 for the first 
intra coefficient and 31 for all other coeffi- 
cients. Within a macroblock, the same quan- 
tizer is used for all coefficients except the first 
one of intra-blocks. The first intra-coefficient is 
usually the transform DC value linearly quan- 
tized with a step size of 8 and no dead zone. 
Each of the other 31 quantizers is also linear, 
but with a central dead zone around zero and a 
step size of an even value in the range of 2-62. 

Coding Control 

Although not a part of H.263, several 
parameters may be varied to control the rate of 
coded video data. These include processing 
prior to coding, the quantizer, block signifi- 
cance criterion, and temporal subsampling. 

[Figure 10.8: block diagram not reproduced. Signal labels recoverable 
from the diagram: intra/inter flag, transmit flag, quantizer indication, 
quantizing index, motion vector, and loop filter on/off.] 


Figure 10.8. Typical Baseline H.263 Encoder. 



Parameters                 16CIF         4CIF        CIF         QCIF        SQCIF 

active resolution (Y)      1408 x 1152   704 x 576   352 x 288   176 x 144   128 x 96 

frame rate                 29.97 Hz (all formats) 

YCbCr sampling structure   4:2:0 (all formats) 

form of YCbCr coding       Uniformly quantized PCM, 8 bits per sample (all formats) 


Table 10.9. Baseline H.263 YCbCr Parameters. 

Forced Updating 

This is achieved by forcing the use of the 
intra mode. To control the accumulation of 
inverse transform mismatch errors, a macro- 
block should be forcibly updated at least once 
every 132 times it is transmitted. 

Video Bitstream 

Unless specified otherwise, the most sig- 
nificant bits are transmitted first. Bit 1, the left- 
most bit in the code tables, is the most 
significant. Unless specified otherwise, all 
unused or spare bits are set to “1.” 



The video multiplexer is arranged in a 
hierarchical structure with four layers. From 
top to bottom the layers are: 

Picture 

Group of Blocks (GOB) or Slice 
Macroblock (MB) 

Block 

Picture Layer 

Data for each picture consists of a picture 
header followed by data for the groups of blocks 
(GOBs), followed by an end-of-sequence 
(EOS) code and stuffing bits (PSTUF). The baseline 
structure is shown in Figure 10.10. Picture 
headers for dropped pictures are not transmitted. 



[Figure 10.9: diagram not reproduced. It shows a picture of 352 samples 
x 288 lines divided into groups of blocks, with macroblocks 1-22 across 
one GOB row, and the block arrangement within a macroblock: blocks 0-3 
(Y), block 4 (Cb), and block 5 (Cr).] 


Figure 10.9. H.263 Arrangement of Group of Blocks, Macroblocks, and Blocks. 






Picture Start Code (PSC) 

PSC is a 22-bit word with a value of 0000 
0000 0000 0000 1 00000. It must be byte- 
aligned; therefore, 0-7 zero bits are added 
before the start code to ensure the first bit of 
the start code is the first, and most significant, 
bit of a byte. 
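
As an illustration of the byte alignment rule, the following minimal Python sketch scans a buffer for byte-aligned picture start codes. The function name and the sample bytes are hypothetical, and the sketch ignores everything a real decoder would do after finding the code. Because the PSC is 16 zero bits, a one, and five zero bits, an aligned PSC appears as two zero bytes followed by a byte whose upper six bits are 100000.

```python
def find_picture_start_codes(data: bytes) -> list[int]:
    """Return byte offsets at which a byte-aligned 22-bit PSC begins."""
    offsets = []
    for i in range(len(data) - 2):
        # 0x00 0x00 then '100000' in the top six bits of the third byte.
        if data[i] == 0 and data[i + 1] == 0 and (data[i + 2] & 0xFC) == 0x80:
            offsets.append(i)
    return offsets


# Hypothetical buffer: one PSC at offset 1, then a start code whose
# five bits after the '1' are not all zero (so it is not a PSC).
sample = bytes([0xFF, 0x00, 0x00, 0x80, 0x02, 0x00, 0x00, 0x84])
print(find_picture_start_codes(sample))
```

The lower two bits of the third byte are masked off because they already belong to the TR field that follows the PSC.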

Temporal Reference (TR) 

TR is an 8-bit binary number representing 
256 possible values. It is generated by incre- 
menting its value in the previously transmitted 
picture header by one and adding the number 
of non-transmitted 29.97 Hz pictures since the 
last transmitted one. The arithmetic is per- 
formed with only the eight LSBs. 



If a custom picture clock frequency (PCF) 
is indicated, extended TR (ETR) and TR form a 
10-bit number where TR stores the eight LSBs 
and ETR stores the two MSBs. The arithmetic 
in this case is performed with the 10 LSBs. 
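
The TR arithmetic can be sketched as follows. This is a minimal illustration with hypothetical function names, assuming the caller already knows how many pictures were skipped; the modulus is 256 for the standard 8-bit TR and 1024 when ETR extends it to 10 bits.

```python
def next_temporal_reference(prev_tr: int, skipped: int, custom_pcf: bool = False) -> int:
    """TR for the next transmitted picture.

    prev_tr: TR value in the previously transmitted picture header.
    skipped: number of non-transmitted pictures since that picture.
    custom_pcf: True when ETR and TR together form a 10-bit number.
    """
    modulus = 1024 if custom_pcf else 256  # keep only the 8 or 10 LSBs
    return (prev_tr + 1 + skipped) % modulus


def split_extended_tr(tr10: int) -> tuple[int, int]:
    """Split a 10-bit temporal reference into (ETR, TR)."""
    return (tr10 >> 8) & 0x3, tr10 & 0xFF


print(next_temporal_reference(254, 2))        # wraps within 8 bits
print(next_temporal_reference(254, 2, True))  # no wrap within 10 bits
print(split_extended_tr(257))
```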

In the PB frame and improved PB frame 
mode, TR only addresses P frames. 



PICTURE LAYER:      PSC TR PTYPE PQUANT CPM PSBI TRB DBQUANT PEI PSUPP PEI GOB GOB ... GOB EOS PSTUF 

GOB LAYER:          GBSC GN GSBI GFID GQUANT MB MB ... MB 

MACROBLOCK LAYER:   COD MCBPC MODB CBPB CBPY DQUANT MVD MVD2 MVD3 MVD4 MVDB BLOCK 0 ... BLOCK 5 

BLOCK LAYER:        INTRADC TCOEF 


Figure 10.10. Baseline H.263 Video Bitstream Layer Structures (Without Optional PLUSPTYPE 
Related Fields in the Picture Layer). 

Type Information (PTYPE) 

PTYPE contains 13 bits of information 
about the picture: 

Bit 1 “1” 

Bit 2 “0” 

Bit 3 Split screen indicator 
“0” = off, “1” = on 

Bit 4 Document camera indicator 
“0” = off, “1” = on 

Bit 5 Freeze picture release 
“0” = off, “1” = on 

Bit 6-8 Source format 
“000” = reserved 
“001” = SQCIF 
“010” = QCIF 
“011” = CIF 
“100” = 4CIF 
“101” = 16CIF 
“110” = reserved 
“111” = extended PTYPE 

If bits 6-8 are not “111,” the following 5 bits 
are present in PTYPE: 

Bit 9 Picture coding type 
“0” = intra, “1” = inter 

Bit 10 Optional unrestricted motion 
vector mode 
“0” = off, “1” = on 

Bit 11 Optional syntax-based arithmetic 
coding mode 
“0” = off, “1” = on 

Bit 12 Optional advanced prediction 
mode 
“0” = off, “1” = on 

Bit 13 Optional PB frames mode 
“0” = normal picture 
“1” = PB frame 



If bit 9 is set to “0,” bit 13 must be set to a “0.” 
Bits 10-13 are optional modes that are negoti- 
ated between the encoder and decoder. 
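
The PTYPE layout lends itself to a small parser. The sketch below is illustrative only: it takes the field as a string of "0" and "1" characters (bit 1 first), and the function and dictionary key names are hypothetical. It stops after bit 8 when the source format signals extended PTYPE, since bits 9-13 are then absent.

```python
SOURCE_FORMATS = {
    "001": "SQCIF", "010": "QCIF", "011": "CIF",
    "100": "4CIF", "101": "16CIF",
}


def parse_ptype(bits: str) -> dict:
    """Parse PTYPE given as a bit string, bit 1 (the MSB) first."""
    if len(bits) < 8 or bits[0] != "1" or bits[1] != "0":
        raise ValueError("PTYPE needs at least 8 bits and must start with '10'")
    fields = {
        "split_screen": bits[2] == "1",
        "document_camera": bits[3] == "1",
        "freeze_picture_release": bits[4] == "1",
    }
    fmt = bits[5:8]  # bits 6-8
    if fmt == "111":
        # Extended PTYPE: bits 9-13 are not present in this field.
        fields["source_format"] = "extended PTYPE"
        return fields
    fields["source_format"] = SOURCE_FORMATS.get(fmt, "reserved")
    if len(bits) < 13:
        raise ValueError("bits 9-13 are required for this source format")
    fields["picture_coding_type"] = "inter" if bits[8] == "1" else "intra"
    fields["unrestricted_motion_vector"] = bits[9] == "1"
    fields["syntax_based_arithmetic_coding"] = bits[10] == "1"
    fields["advanced_prediction"] = bits[11] == "1"
    fields["pb_frames"] = bits[12] == "1"
    if bits[8] == "0" and bits[12] == "1":
        raise ValueError("bit 13 must be 0 when bit 9 is 0")
    return fields


print(parse_ptype("1000001010000"))  # QCIF inter picture, no options
```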

Quantizer Information (PQUANT) 

PQUANT is a 5-bit binary number (value 
of 1-31) representing the quantizer to be used 
until updated by a subsequent GQUANT or 
DQUANT. 

Continuous Presence Multipoint (CPM) 

CPM is a 1-bit value that signals the use of 
the optional continuous presence multipoint and 
video multiplex mode; “0” = off, “1” = on. CPM 
immediately follows PQUANT if PLUSPTYPE 
is not present, and is immediately after PLUSP- 
TYPE if PLUSPTYPE is present. 

Picture Sub-Bitstream Indicator (PSBI) 

PSBI is an optional 2-bit binary number 
that is only present if the optional continuous 
presence multipoint and video multiplex mode is 
indicated by CPM. 

Temporal Reference of B Frames in PB Frames 
(TRB) 

TRB is present if PTYPE or PLUSPTYPE 
indicates a PB frame or improved PB frame. TRB 
is a 3-bit or 5-bit binary number giving the [number 
+ 1] of non-transmitted pictures (at 29.97 Hz or 
the custom picture clock frequency indicated 
in CPCFC) since the last I or P frame, or the P 
part of a PB frame or improved PB frame, and 
before the B part of the PB frame or improved 
PB frame. The value of TRB is extended to 5 
bits when a custom picture clock frequency is 
in use. 

The maximum number of non-transmitted 
pictures is six for 29.97 Hz, or thirty when a 
custom picture clock frequency is used. 

Quantizer Information for B Frames in PB 
Frames (DBQUANT) 

DBQUANT is present if PTYPE or 
PLUSPTYPE indicates a PB frame or improved 
PB frame. DBQUANT is a 2-bit codeword 
indicating the relationship between QUANT and 
BQUANT as shown in Table 10.10. The 
division is done using truncation. BQUANT has a 
range of 1-31. If the result is less than 1 or 
greater than 31, BQUANT is clipped to 1 or 
31, respectively. 



DBQUANT     BQUANT 

00          (5 * QUANT) / 4 

01          (6 * QUANT) / 4 

10          (7 * QUANT) / 4 

11          (8 * QUANT) / 4 



Table 10.10. Baseline H.263 DBQUANT 
Codes and QUANT/BQUANT Relationship. 
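
The relationship in Table 10.10 reduces to one line of integer arithmetic. The sketch below uses a hypothetical function name and treats DBQUANT as an integer 0-3; since both operands are positive, Python's floor division gives the same result as truncation.

```python
def bquant_from_dbquant(quant: int, dbquant: int) -> int:
    """Derive BQUANT from QUANT and the 2-bit DBQUANT code (Table 10.10)."""
    if not 1 <= quant <= 31:
        raise ValueError("QUANT must be in 1-31")
    if dbquant not in (0, 1, 2, 3):
        raise ValueError("DBQUANT is a 2-bit code")
    # Codes 00, 01, 10, 11 select multipliers 5, 6, 7, 8.
    value = ((5 + dbquant) * quant) // 4
    return max(1, min(31, value))  # clip to the 1-31 range


print(bquant_from_dbquant(10, 0b00))
print(bquant_from_dbquant(31, 0b11))
```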



Extra Insertion Information (PEI) 

PEI is a bit which when set to “1” signals 
the presence of the PSUPP data field. 

Supplemental Enhancement Information 
(PSUPP) 

If PEI is set to “1,” then 9 bits follow con- 
sisting of 8 bits of data (PSUPP) and another 
PEI bit to indicate if a further 9 bits follow, and 
so on. 

End of Sequence (EOS) 

EOS is a 22-bit word with a value of 0000 
0000 0000 0000 1 11111. EOS must be byte 
aligned by inserting 0-7 zero bits before the 
code so that the first bit of the EOS code is the 
first, and most significant, bit of a byte. 

Stuffing (PSTUF) 

PSTUF is a variable-length word of zero 
bits. The last bit of PSTUF must be the last, 
and least significant, bit of a byte. 

Group of Blocks (GOB) Layer 

As shown in Figure 10.9, each picture is 
divided into groups of blocks (GOBs) . A GOB 
comprises 16 lines for the SQCIF, QCIF, and 
CIF resolutions, 32 lines for the 4CIF resolu- 
tion, and 64 lines for the 16CIF resolution. 
Thus, a SQCIF picture contains six GOBs (96/ 
16) each with one row of macroblock data. 
QCIF pictures have nine GOBs (144/16) each 
with one row of macroblock data. A CIF pic- 
ture contains eighteen GOBs (288/16) each 
with one row of macroblock data. 4CIF pic- 
tures have eighteen GOBs (576/32) each with 
two rows of macroblock data. A 16CIF picture 
has eighteen GOBs (1152/64) each with four 
rows of macroblock data. GOB numbering 
starts with 0 at the top of picture, and increases 
going down vertically. 
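
The GOB counts above follow directly from the picture dimensions. The following sketch recomputes them; the table and function names are hypothetical, and the lines-per-GOB values are taken from the text rather than derived.

```python
# (width, height, lines per GOB) for the five standard formats.
STANDARD_FORMATS = {
    "SQCIF": (128, 96, 16),
    "QCIF": (176, 144, 16),
    "CIF": (352, 288, 16),
    "4CIF": (704, 576, 32),
    "16CIF": (1408, 1152, 64),
}


def gob_layout(name: str) -> tuple[int, int, int]:
    """Return (GOBs per picture, macroblock rows per GOB, macroblocks per row)."""
    width, height, gob_lines = STANDARD_FORMATS[name]
    return height // gob_lines, gob_lines // 16, width // 16


for fmt in STANDARD_FORMATS:
    print(fmt, gob_layout(fmt))
```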

Data for each GOB consists of a GOB 
header followed by macroblock data, as shown 
in Figure 10.10. Macroblock data is transmit- 
ted in increasing macroblock number order. 
For GOB number 0 in each picture, no GOB 
header is transmitted. A decoder can signal an 
encoder to transmit only non-empty GOB 
headers. 

Group of Blocks Start Code (GBSC) 

GBSC is a 17-bit word with a value of 0000 
0000 0000 0000 1. It must be byte-aligned; 
therefore, 0-7 zero bits are added before the 
start code to ensure the first bit of the start 
code is the first, and most significant, bit of a 
byte. 

Group Number (GN) 

GN is a 5-bit binary number indicating the 
number of the GOB. Group numbers 1-17 are 
used with the standard picture formats. Group 
numbers 1-24 are used with custom picture 
formats. Group numbers 16-29 are emulated 
in the slice header. Group number 30 is used in 
the end of sub-bitstream indicators (EOSBS) 
code and group number 31 is used in the end 
of sequence (EOS) code. 

GOB Sub-Bitstream Indicator (GSBI) 

GSBI is a 2-bit binary number represent- 
ing the sub-bitstream number until the next 
picture or GOB start code. GSBI is present 
only if continuous presence multipoint and video 
multiplex (CPM) mode is enabled. 

GOB Frame ID (GFID) 

GFID is a 2-bit value indicating the frame 
ID. It must have the same value in every GOB 
(or slice) header of a given frame. In general, if 
PTYPE is the same as for the previous picture 
header, the GFID value must be the same as 
the previous frame. If PTYPE has changed 



from the previous picture header, GFID must 
have a different value from the previous frame. 

Quantizer Information (GQUANT) 

GQUANT is a 5-bit binary number that 
indicates the quantizer to be used in the group 
of blocks until overridden by any subsequent 
GQUANT or DQUANT. The codewords are 
the binary representations of the values 1-31. 

Macroblock (MB) Layer 

Each GOB is divided into macroblocks, as 
shown in Figure 10.9. A macroblock relates to 
16 samples x 16 lines of Y and the correspond- 
ing 8 samples x 8 lines of Cb and Cr. Macro- 
block numbering increases left-to-right and 
top-to-bottom. Macroblock data is transmitted 
in increasing macroblock numbering order. 

Data for a macroblock consists of an MB 
header followed by block data (Figure 10.10). 

Coded Macroblock Indication (COD) 

COD is a single bit that indicates whether 
or not the macroblock is coded. “0” indicates coded; 
“1” indicates not coded, and the rest of the 
macroblock layer is empty. COD is present 
only in pictures that are not intra. 

If not coded, the decoder processes the 
macroblock as an inter macroblock with motion 
vectors equal to zero for the whole macroblock and no 
coefficient data. 

Macroblock Type and Coded Block Pattern for 
Chrominance (MCBPC) 

MCBPC is a variable-length codeword 
indicating the macroblock type and the coded 
block pattern for Cb and Cr. 

Codewords for MCBPC are listed in 
Tables 10.11 and 10.12. A codeword is available 
for bit stuffing, and should be discarded by 
decoders. In some cases, bit stuffing must not 
occur before the first macroblock of the picture 
to avoid start code emulation. The macroblock 
types (MB type) are listed in Tables 
10.13 and 10.14. 

MB Type     CBPC (Cb, Cr)    Code 

3           0, 0             1 
3           0, 1             001 
3           1, 0             010 
3           1, 1             011 
4           0, 0             0001 
4           0, 1             0000 01 
4           1, 0             0000 10 
4           1, 1             0000 11 
stuffing    -                0000 0000 1 

Table 10.11. Baseline H.263 Variable-Length Code Table for MCBPC for I Frames. 

The coded block pattern for chrominance 
(CBPC) signifies when a non-intra-DC trans- 
form coefficient is transmitted for Cb or Cr. A 
“1” indicates a non-intra-DC coefficient is 
present in that block. 
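
To make the variable-length decoding concrete, here is a minimal sketch of an MCBPC decoder for I frames built from Table 10.11. It works on a string of "0"/"1" characters, and the names are hypothetical. Stuffing codewords are discarded, as the text requires; the table for P frames is not covered.

```python
# Codeword -> (MB type, Cb coded flag, Cr coded flag); None marks stuffing.
MCBPC_I_FRAME = {
    "1": (3, 0, 0), "001": (3, 0, 1), "010": (3, 1, 0), "011": (3, 1, 1),
    "0001": (4, 0, 0), "000001": (4, 0, 1), "000010": (4, 1, 0),
    "000011": (4, 1, 1), "000000001": None,
}
MAX_CODE_LENGTH = 9


def decode_mcbpc_i(bits: str, pos: int = 0) -> tuple[tuple[int, int, int], int]:
    """Decode one MCBPC codeword; return ((mb_type, cb, cr), next position)."""
    code = ""
    while pos < len(bits):
        code += bits[pos]
        pos += 1
        if code in MCBPC_I_FRAME:
            entry = MCBPC_I_FRAME[code]
            if entry is None:
                code = ""  # stuffing: discard and keep reading
                continue
            return entry, pos
        if len(code) >= MAX_CODE_LENGTH:
            raise ValueError("invalid MCBPC codeword")
    raise ValueError("ran out of bits")


print(decode_mcbpc_i("011"))
print(decode_mcbpc_i("0000000010001"))  # stuffing followed by type 4
```

A dictionary lookup on a growing prefix works here because the codewords form a prefix code, so no codeword is the start of another.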

Macroblock Mode for B Blocks (MODB) 

MODB is present for macroblock types 0- 
4 if PTYPE indicates PB frame. It is a variable- 
length codeword indicating whether B coeffi- 
cients and/or motion vectors are transmitted 
for this macroblock. Table 10.15 lists the code- 
words for MODB. MODB is coded differently 
for improved PB frames. 

Coded Block Pattern for B Blocks (CBPB) 

The 6-bit CBPB is present if indicated by 
MODB. It indicates which blocks in the 
macroblock have at least one transform coefficient 
transmitted. The pattern number is 
represented as: 

P0 P1 P2 P3 P4 P5 

where Pn = “1” if any coefficient is present for 
block [n], else Pn = “0.” Block numbering 
(decimal format) is given in Figure 10.9. 

Coded Block Pattern for Luminance (CBPY) 

CBPY is a variable-length codeword 
specifying the Y blocks in the macroblock for which 
at least one non-intra-DC transform coefficient 
is transmitted. However, in the advanced 
intra-coding mode, intra-DC is indicated in the same 
manner as the other coefficients. 

Table 10.16 lists the codes for CBPY. YN (N = 0-3) is 
a “1” if any non-intra-DC coefficient is present 
for that Y block. Y block numbering (decimal 
format) is as shown in Figure 10.9. 

Quantizer Information (DQUANT) 

DQUANT is a 2-bit codeword signifying 
the change in QUANT. Table 10.17 lists the 
differential values for the codewords. 

QUANT has a range of 1-31. If the value of 
QUANT as a result of the indicated change is 
less than 1 or greater than 31, it is clipped to 1 or 
31, respectively. 


Frame   MB          Name           COD   MCBPC   CBPY   DQUANT   MVD   MVD2-4 
Type    Type 

inter   not coded   -              X 
inter   0           inter          X     X       X               X 
inter   1           inter + q      X     X       X      X        X 
inter   2           inter4v        X     X       X               X     X 
inter   3           intra          X     X       X 
inter   4           intra + q      X     X       X      X 
inter   5           inter4v + q    X     X       X      X        X     X 
inter   stuffing    -              X     X 
intra   3           intra                X       X 
intra   4           intra + q            X       X      X 
intra   stuffing    -                    X 


Table 10.13. Baseline H.263 Macroblock Types and Included Data for Normal Frames. 



Frame   MB          Name           COD   MCBPC   MODB   CBPY 
Type    Type 

inter   not coded   -              X 
inter   0           inter          X     X       X      X 
inter   1           inter + q      X     X       X      X 
inter   2           inter4v        X     X       X      X 
inter   3           intra          X     X       X      X 
inter   4           intra + q      X     X       X      X 
inter   5           inter4v + q    X     X       X      X 
inter   stuffing    -              X     X 


Table 10.14a. Baseline H.263 Macroblock Types and Included Data for PB Frames. 

Frame   MB          Name           CBPB   DQUANT   MVD   MVDB   MVD2-4 
Type    Type 

inter   not coded   - 
inter   0           inter          X               X     X 
inter   1           inter + q      X      X        X     X 
inter   2           inter4v        X               X     X      X 
inter   3           intra          X               X     X 
inter   4           intra + q      X      X        X     X 
inter   5           inter4v + q    X      X        X     X      X 
inter   stuffing    - 


Table 10.14b. Baseline H.263 Macroblock Types and Included Data for PB Frames. 



CBPB    MVDB    Code 

                0 
        X       10 
X       X       11 



Table 10.15. Baseline H.263 Variable-Length Code Table for MODB. 

CBPY (Y0, Y1, Y2, Y3)          Code 
Intra          Inter 

0, 0, 0, 0     1, 1, 1, 1      0011 
0, 0, 0, 1     1, 1, 1, 0      0010 1 
0, 0, 1, 0     1, 1, 0, 1      0010 0 
0, 0, 1, 1     1, 1, 0, 0      1001 
0, 1, 0, 0     1, 0, 1, 1      0001 1 
0, 1, 0, 1     1, 0, 1, 0      0111 
0, 1, 1, 0     1, 0, 0, 1      0000 10 
0, 1, 1, 1     1, 0, 0, 0      1011 
1, 0, 0, 0     0, 1, 1, 1      0001 0 
1, 0, 0, 1     0, 1, 1, 0      0000 11 
1, 0, 1, 0     0, 1, 0, 1      0101 
1, 0, 1, 1     0, 1, 0, 0      1010 
1, 1, 0, 0     0, 0, 1, 1      0100 
1, 1, 0, 1     0, 0, 1, 0      1000 
1, 1, 1, 0     0, 0, 0, 1      0110 
1, 1, 1, 1     0, 0, 0, 0      11 




Table 10.16. Baseline H.263 Variable-Length Code Table for CBPY. 



Differential Value    DQUANT 
of QUANT 

-1                    00 
-2                    01 
 1                    10 
 2                    11 



Table 10.17. Baseline H.263 DQUANT Codes for QUANT Differential Values 
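
Applying a DQUANT code is a table lookup followed by a clip. The sketch below is a minimal illustration with hypothetical names, treating the 2-bit code as an integer.

```python
# 2-bit DQUANT code -> differential value of QUANT (Table 10.17).
DQUANT_DELTA = {0b00: -1, 0b01: -2, 0b10: 1, 0b11: 2}


def apply_dquant(quant: int, dquant: int) -> int:
    """Return the updated QUANT, clipped to the range 1-31."""
    return max(1, min(31, quant + DQUANT_DELTA[dquant]))


print(apply_dquant(10, 0b10))  # one step up
print(apply_dquant(1, 0b01))   # would go below 1, so it is clipped
```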

Motion Vector Data (MVD) 

Motion vector data is included for all inter 
macroblocks and, when in PB frame mode, 
for intra macroblocks as well. 

Motion vector data consists of a variable- 
length codeword for the horizontal component, 
followed by a variable-length codeword for the 
vertical component. The variable-length codes 
are listed in Table 10.18. For the unrestricted 
motion vector mode, other motion vector cod- 
ing may be used. 

Motion Vector Data (MVD2-4) 

The three codewords MVD2, MVD3, and 
MVD4 are present if indicated by PTYPE and 
MCBPC during the advanced prediction or 
deblocking filter modes. Each consists of a 
variable-length codeword for the horizontal 
component followed by a variable-length codeword 
for the vertical component. The variable-length 
codes are listed in Table 10.18. 

Motion Vector Data for B Macroblock (MVDB) 

MVDB is present if indicated by MODB 
during the PB frame and improved PB frame 
modes. It consists of a variable-length code- 
word for the horizontal component followed by 
a variable-length codeword for the vertical 
component of each vector. The variable-length 
codes are listed in Table 10.18. 

Block Layer 

If not in PB frames mode, a macroblock is 
made up of four Y blocks, a Cb block, and a Cr 
block (see Figure 10.9) . Data for an 8 sample x 
8 line block consists of codewords for the intra- 
DC coefficient and transform coefficients as 
shown in Figure 10.10. The order of block 
transmission is shown in Figure 10.9. 



In PB frames mode, a macroblock is made 
up of four Y blocks, a Cb block, a Cr block, and 
data for six B blocks. 

The quantized DCT coefficients are trans- 
mitted in the order shown in Figure 7.59. In 
the modified quantization mode, quantized 
DCT coefficients are transmitted in the order 
shown in Figure 7.60. 

DC Coefficient for Intra-Blocks (Intra-DC) 

Intra-DC is an 8-bit codeword. The values 
and their corresponding reconstruction levels 
are listed in Table 10.19. 

If not in PB frames mode, the intra-DC 
coefficient is present for every block of the 
macroblock if MCBPC indicates macroblock 
type 3 or 4. In PB frames mode, the intra-DC 
coefficient is present for every P block if 
MCBPC indicates macroblock type 3 or 4 (the 
intra-DC coefficient is not present for B 
blocks). 

Transform Coefficient (TCOEF) 

If not in PB frames mode, TCOEF is 
present if indicated by MCBPC or CBPY. In PB 
frames mode, TCOEF is present for B blocks if 
indicated by CBPB. 

An event is a combination of a last non-zero 
coefficient indication (LAST = “0” if there are 
more non-zero coefficients in the block; LAST 
= “1” if there are no more non-zero coefficients 
in the block), the number of successive zeros 
preceding the coefficient (RUN) , and the non- 
zero coefficient (LEVEL) . 

The most common events are coded using 
a variable-length code, shown in Table 10.20. 
The “s” bit indicates the sign of the level; “0” 
for positive, and “1” for negative. 
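
The event structure can be illustrated by converting a list of quantized coefficients into (LAST, RUN, LEVEL) triples. This is a sketch under the assumption that the coefficients are already in transmission order and that any intra-DC coefficient has been removed; the function name is hypothetical.

```python
def to_events(coeffs: list[int]) -> list[tuple[int, int, int]]:
    """Convert scan-ordered coefficients into (LAST, RUN, LEVEL) events."""
    nonzero_positions = [i for i, c in enumerate(coeffs) if c != 0]
    events = []
    previous = -1  # position of the previous non-zero coefficient
    for k, position in enumerate(nonzero_positions):
        last = 1 if k == len(nonzero_positions) - 1 else 0
        run = position - previous - 1  # zeros preceding this coefficient
        events.append((last, run, coeffs[position]))
        previous = position
    return events


print(to_events([5, 0, 0, -2, 1, 0, 0, 0]))
```

Trailing zeros after the final non-zero coefficient produce no event; the LAST flag on the final event tells the decoder that the rest of the block is zero.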

Intra DC Value    Reconstruction Level 

0000 0000         not used 
0000 0001         8 
0000 0010         16 
0000 0011         24 
:                 : 
0111 1111         1016 
1111 1111         1024 
1000 0001         1032 
:                 : 
1111 1101         2024 
1111 1110         2032 


Table 10.19. Baseline H.263 Reconstruction Levels for Intra DC. 



Other combinations of (LAST, RUN, 
LEVEL) are encoded using a 22-bit word: 7 bits 
of escape (ESC), 1 bit of LAST, 6 bits of RUN, 
and 8 bits of LEVEL. The codes for RUN and 
LEVEL are shown in Table 10.21. Code 1000 
0000 is forbidden unless in the modified quanti- 
zation mode. 
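
The layout of the 22-bit escape word can be sketched as follows. The function name is hypothetical, the modified quantization mode is not handled, and LEVEL is packed as an 8-bit two's complement value, which matches the codes shown in Table 10.21.

```python
ESC = 0b0000011  # 7-bit escape code from Table 10.20


def escape_codeword(last: int, run: int, level: int) -> str:
    """Build the 22-bit escape word: ESC (7) + LAST (1) + RUN (6) + LEVEL (8)."""
    if last not in (0, 1):
        raise ValueError("LAST is a single bit")
    if not 0 <= run <= 63:
        raise ValueError("RUN must fit in 6 bits")
    if level == 0 or not -127 <= level <= 127:
        raise ValueError("LEVEL codes for 0 and -128 are forbidden")
    word = (ESC << 15) | (last << 14) | (run << 8) | (level & 0xFF)
    return format(word, "022b")


print(escape_codeword(0, 0, 1))
print(escape_codeword(1, 63, -1))
```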

All coefficients, except for intra-DC, have 
reconstruction levels (REC) in the range -2048 
to 2047. Reconstruction levels are recovered 
by the following equations, and the results are 
clipped. 

if LEVEL = 0: 

REC = 0 

if QUANT is odd: 

|REC| = QUANT x (2 x |LEVEL| + 1) 

if QUANT is even: 

|REC| = QUANT x (2 x |LEVEL| + 1) - 1 

After calculation of |REC|, the sign is 
added to obtain REC. Sign(LEVEL) is specified 
by the “s” bit in the TCOEF code in Table 
10.20. 

REC = sign(LEVEL) x |REC| 

For intra-DC blocks, the reconstruction 
level is: 

REC = 8 x LEVEL 
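
The reconstruction rules translate into a few lines of code. The sketch below uses hypothetical function names. For intra-DC it follows Table 10.19, where code 1111 1111 stands for level 128; codes 0000 0000 and 1000 0000 are rejected because the first is marked not used and the second does not appear in the table.

```python
def reconstruct(level: int, quant: int) -> int:
    """Reconstruction level for a non-intra-DC coefficient."""
    if level == 0:
        return 0
    magnitude = quant * (2 * abs(level) + 1)
    if quant % 2 == 0:
        magnitude -= 1  # even QUANT: subtract 1
    rec = magnitude if level > 0 else -magnitude
    return max(-2048, min(2047, rec))  # clip to the allowed range


def reconstruct_intra_dc(code: int) -> int:
    """Reconstruction level for an 8-bit intra-DC codeword."""
    if code in (0x00, 0x80):
        raise ValueError("codeword not used for intra-DC")
    level = 128 if code == 0xFF else code
    return 8 * level


print(reconstruct(1, 5), reconstruct(-1, 4), reconstruct(127, 31))
print(reconstruct_intra_dc(0x7F), reconstruct_intra_dc(0xFF))
```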

Last   Run   |Level|   Code 

0      0     1         10s 
0      0     2         1111 s 
0      0     3         0101 01s 
0      0     4         0010 111s 
0      0     5         0001 1111 s 
0      0     6         0001 0010 1s 
0      0     7         0001 0010 0s 
0      0     8         0000 1000 01s 
0      0     9         0000 1000 00s 
0      0     10        0000 0000 111s 
0      0     11        0000 0000 110s 
0      0     12        0000 0100 000s 
0      1     1         110s 
0      1     2         0101 00s 
0      1     3         0001 1110 s 
0      1     4         0000 0011 11s 
0      1     5         0000 0100 001s 
0      1     6         0000 0101 0000 s 
0      2     1         1110 s 
0      2     2         0001 1101 s 
0      2     3         0000 0011 10s 
0      2     4         0000 0101 0001 s 
0      3     1         0110 1s 
0      3     2         0001 0001 1s 
0      3     3         0000 0011 01s 
0      4     1         0110 0s 
0      4     2         0001 0001 0s 
0      4     3         0000 0101 0010 s 
0      5     1         0101 1s 
0      5     2         0000 0011 00s 
0      5     3         0000 0101 0011 s 
0      6     1         0100 11s 
0      6     2         0000 0010 11s 
0      6     3         0000 0101 0100 s 
0      7     1         0100 10s 


Table 10.20a. Baseline H.263 Variable-Length Code Table for TCOEF. 

Last   Run   |Level|   Code 

1      8     1         0010 000s 
1      9     1         0001 1010 s 
1      10    1         0001 1001 s 
1      11    1         0001 1000 s 
1      12    1         0001 0111 s 
1      13    1         0001 0110 s 
1      14    1         0001 0101 s 
1      15    1         0001 0100 s 
1      16    1         0001 0011 s 
1      17    1         0000 1100 0s 
1      18    1         0000 1011 1s 
1      19    1         0000 1011 0s 
1      20    1         0000 1010 1s 
1      21    1         0000 1010 0s 
1      22    1         0000 1001 1s 
1      23    1         0000 1001 0s 
1      24    1         0000 1000 1s 
1      25    1         0000 0001 11s 
1      26    1         0000 0001 10s 
1      27    1         0000 0001 01s 
1      28    1         0000 0001 00s 
1      29    1         0000 0100 100s 
1      30    1         0000 0100 101s 
1      31    1         0000 0100 110s 
1      32    1         0000 0100 111s 
1      33    1         0000 0101 1000 s 
1      34    1         0000 0101 1001 s 
1      35    1         0000 0101 1010 s 
1      36    1         0000 0101 1011 s 
1      37    1         0000 0101 1100 s 
1      38    1         0000 0101 1101 s 
1      39    1         0000 0101 1110 s 
1      40    1         0000 0101 1111 s 
ESC                    0000 011 


Table 10.20c. Baseline H.263 Variable-Length Code Table for TCOEF. 

Run    Code        Level    Code 

0      0000 00     -128     forbidden 
1      0000 01     -127     1000 0001 
:      :           :        : 
63     1111 11     -2       1111 1110 
                   -1       1111 1111 
                   0        forbidden 
                   1        0000 0001 
                   2        0000 0010 
                   :        : 
                   127      0111 1111 


Table 10.21. Baseline H.263 Run, Level Codes. 



PLUSPTYPE Picture Layer Option 

PLUSPTYPE is present when indicated by 
bits 6-8 of PTYPE, and is used to enable the 
H.263 version 2 options. When present, the 
PLUSPTYPE and related fields immediately 
follow PTYPE, preceding PQUANT. 

If PLUSPTYPE is present, then CPM 
immediately follows PLUSPTYPE. If PLUSPTYPE 
is not present, then CPM immediately 
follows PQUANT. PSBI always immediately 
follows CPM (if CPM = “1”). 

PLUSPTYPE is a 12- or 30-bit codeword, 
consisting of up to three subfields: UFEP, 
OPPTYPE, and MPPTYPE. The PLUSPTYPE 
and related fields are illustrated in Figure 
10.11. 



Update Full Extended PTYPE (UFEP) 

UFEP is a 3-bit codeword present if 
“extended PTYPE” is indicated by PTYPE. 

A value of “000” indicates that only MPP- 
TYPE is included in the picture header. 

A value of “001” indicates that both OPPTYPE 
and MPPTYPE are included in the picture 
header. If the picture type is intra or EI, 
this field must be set to “001.” 

In addition, if PLUSPTYPE is present in 
each of a continuing sequence of pictures, this 
field shall be set to “001” every 5 seconds or 
every five frames, whichever is larger. UFEP 
should be set to “001” more often in error- 
prone environments. 

Values other than “000” and “001” are 
reserved. 

Optional Part of PLUSPTYPE (OPPTYPE) 

This field contains features that are not 
likely to be changed from one frame to 
another. If UFEP is “001,” the following bits are 
present in OPPTYPE: 

Bit 1-3 Source format 
“000” = reserved 
“001” = SQCIF 
“010” = QCIF 
“011” = CIF 
“100” = 4CIF 
“101” = 16CIF 
“110” = custom source format 
“111” = reserved 

Bit 4 Custom picture clock frequency 
“0” = standard, “1” = custom 

Bit 5 Unrestricted motion vector 
(UMV) mode 
“0” = off, “1” = on 

Bit 6 Syntax-based arithmetic coding 
(SAC) mode 
“0” = off, “1” = on 

Bit 7 Advanced prediction (AP) mode 
“0” = off, “1” = on 

Bit 8 Advanced intra-coding 
(AIC) mode 
“0” = off, “1” = on 

Bit 9 Deblocking filter (DF) mode 
“0” = off, “1” = on 

Bit 10 Slice-structured (SS) mode 
“0” = off, “1” = on 

Bit 11 Reference picture selection 
(RPS) mode 
“0” = off, “1” = on 

Bit 12 Independent segment decoding 
(ISD) mode 
“0” = off, “1” = on 

Bit 13 Alternative inter-VLC 
(AIV) mode 
“0” = off, “1” = on 

Bit 14 Modified quantization (MQ) mode 
“0” = off, “1” = on 

Bit 15 “1” 

Bit 16 “0” 

Bit 17 “0” 

Bit 18 “0” 


Figure 10.11. H.263 PLUSPTYPE and Related Fields. 

Mandatory Part of PLUSPTYPE (MPPTYPE) 

Regardless of the value of UFEP, the 
following 9 bits are also present in MPPTYPE: 

Bit 1-3 Picture code type 
“000” = I frame (intra) 
“001” = P frame (inter) 
“010” = Improved PB frame 
“011” = B frame 
“100” = EI frame 
“101” = EP frame 
“110” = reserved 
“111” = reserved 

Bit 4 Reference picture resampling 
(RPR) mode 
“0” = off, “1” = on 

Bit 5 Reduced resolution update 
(RRU) mode 
“0” = off, “1” = on 

Bit 6 Rounding type (RTYPE) mode 
“0” = off, “1” = on 

Bit 7 “0” 

Bit 8 “0” 

Bit 9 “1” 



Custom Picture Format (CPFMT) 

CPFMT is a 23-bit value that is present if 
the use of a custom picture format is specified 
by PLUSPTYPE and UFEP is “001.” 

Bit 1-4 Pixel aspect ratio code 
“0000” = reserved 
“0001” = 1:1 
“0010” = 12:11 
“0011” = 10:11 
“0100” = 16:11 
“0101” = 40:33 
“0110” - “1110” = reserved 
“1111” = extended PAR 

Bit 5-13 Picture width indication (PWI) 
number of samples per line = (PWI + 1) x 4 

Bit 14 “1” 

Bit 15-23 Picture height indication (PHI) 
number of lines per frame = (PHI + 1) x 4 


Extended Pixel Aspect Ratio (EPAR) 

EPAR is a 16-bit value present if CPFMT is 
present and “extended PAR” is indicated by 
CPFMT. 


Bit 1-8 PAR width 

Bit 9-16 PAR height 



Custom Picture Clock Frequency Code (CPCFC) 

CPCFC is an 8-bit value present only if 
PLUSPTYPE is present, UFEP is “001,” and 
PLUSPTYPE indicates a custom picture clock 
frequency. The custom picture clock frequency 
(in Hz) is: 

1,800,000 / (clock divisor x clock conversion factor) 

Bit 1 Clock conversion factor code 
“0” = 1000, “1” = 1001 

Bit 2-8 Clock divisor 
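
As a worked example of the formula, the sketch below derives the picture clock frequency from an 8-bit CPCFC value, taking bit 1 as the most significant bit. The function name is hypothetical, and rejecting a clock divisor of zero is an assumption made here because the formula is undefined for it.

```python
def custom_picture_clock_frequency(cpcfc: int) -> float:
    """Picture clock frequency in Hz for an 8-bit CPCFC value."""
    if not 0 <= cpcfc <= 0xFF:
        raise ValueError("CPCFC is an 8-bit value")
    conversion_factor = 1001 if (cpcfc >> 7) & 1 else 1000  # bit 1
    divisor = cpcfc & 0x7F                                  # bits 2-8
    if divisor == 0:
        raise ValueError("clock divisor of 0 is not usable (assumption)")
    return 1_800_000 / (divisor * conversion_factor)


print(custom_picture_clock_frequency(0b0_0111100))  # divisor 60, factor 1000
print(custom_picture_clock_frequency(0b1_0111100))  # divisor 60, factor 1001
```

With a divisor of 60, the two conversion factors give 30 Hz and approximately 29.97 Hz, respectively.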

Extended Temporal Reference (ETR) 

ETR is a 2-bit value present if a custom pic- 
ture clock frequency is in use. It is the two 
MSBs of the 10-bit TR value. 







Unlimited Unrestricted Motion Vectors Indicator 
(UUI) 

UUI is a 1- or 2-bit variable-length value 
indicating the effective range limit of motion 
vectors. It is present if the optional unrestricted 
motion vector mode is indicated in PLUSP- 
TYPE and UFEP is “001.” 

A value of “1” indicates the motion vector 
range is limited according to Tables 10.22 and 
10.23. A value of “01” indicates the motion vec- 
tor range is not limited except by the picture 
size. 



Picture Width    Horizontal Motion Vector Range 

4-352            -32, +31.5 
356-704          -64, +63.5 
708-1408         -128, +127.5 
1412-2048        -256, +255.5 



Table 10.22. Optional Horizontal Motion 
Range. 



Picture Height    Vertical Motion Vector Range 

4-288             -32, +31.5 
292-576           -64, +63.5 
580-1152          -128, +127.5 



Table 10.23. Optional Vertical Motion 
Range. 
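
Tables 10.22 and 10.23 amount to a lookup keyed on picture size. A minimal sketch, with hypothetical names, for the case where UUI is “1”:

```python
# (largest picture dimension covered, (minimum, maximum) vector component)
HORIZONTAL_RANGES = [(352, (-32, 31.5)), (704, (-64, 63.5)),
                     (1408, (-128, 127.5)), (2048, (-256, 255.5))]
VERTICAL_RANGES = [(288, (-32, 31.5)), (576, (-64, 63.5)),
                   (1152, (-128, 127.5))]


def _lookup(size: int, table: list) -> tuple:
    if size < 4:
        raise ValueError("picture dimension too small")
    for upper_limit, vector_range in table:
        if size <= upper_limit:
            return vector_range
    raise ValueError("picture dimension outside the tabulated sizes")


def horizontal_mv_range(picture_width: int) -> tuple:
    """Horizontal motion vector range per Table 10.22."""
    return _lookup(picture_width, HORIZONTAL_RANGES)


def vertical_mv_range(picture_height: int) -> tuple:
    """Vertical motion vector range per Table 10.23."""
    return _lookup(picture_height, VERTICAL_RANGES)


print(horizontal_mv_range(352), vertical_mv_range(288))  # CIF
print(horizontal_mv_range(704), vertical_mv_range(576))  # 4CIF
```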



Slice Structured Submode Bits (SSS) 

SSS is a 2-bit value present only if the 
optional slice structured mode is indicated in 
PLUSPTYPE and UFEP is “001.” If the slice 
structured mode is in use but UFEP is not 
“001,” the last SSS value remains in effect. 



Bit 1 Rectangular slices 
“0” = no, “1” = yes 

Bit 2 Arbitrary slice ordering 

“0” = sequential, “1” = arbitrary 

Enhancement Layer Number (ELNUM) 

ELNUM is a 4-bit value present only during 
the temporal, SNR, and spatial scalability 
mode. It identifies a specific enhancement 
layer. The first enhancement layer above the 
base layer is designated as enhancement layer 
number 2, and the base layer is number 1. 

Reference Layer Number (RLNUM) 

RLNUM is a 4-bit value present only during
the temporal, SNR, and spatial scalability
mode when UFEP is “001.” The layer number for the
frames used as reference anchors is identified 
by the RLNUM. 

Reference Picture Selection Mode Flags 
(RPSMF) 

RPSMF is a 3-bit codeword present only
when the reference picture selection mode is in
use and UFEP is “001.” When present, it indicates
which back-channel messages are needed by
the encoder. If the reference picture selection 
mode is in use but RPSMF is not present, the 
last value of RPSMF that was sent remains in 
effect. 

“000” - “011” = reserved 

“100” = neither ACK nor NACK needed 

“101” = need ACK 

“110” = need NACK 

“111” = need both ACK and NACK 
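
The code assignments above can be read as a marker bit followed by a NACK flag and an ACK flag. The sketch below decodes the 3-bit codeword on that reading; the function name and the returned dictionary are illustrative, not part of H.263.

```python
def decode_rpsmf(code: str) -> dict:
    """Decode a 3-bit RPSMF codeword given as a string such as "101"."""
    if len(code) != 3 or any(bit not in "01" for bit in code):
        raise ValueError("RPSMF must be exactly three binary digits")
    value = int(code, 2)
    if value < 0b100:
        raise ValueError("codes 000 through 011 are reserved")
    # Lowest bit requests ACK messages, middle bit requests NACK messages.
    return {"ack": bool(value & 0b001), "nack": bool(value & 0b010)}


both = decode_rpsmf("111")     # {'ack': True, 'nack': True}
neither = decode_rpsmf("100")  # {'ack': False, 'nack': False}
```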
Temporal Reference for Prediction Indication
(TRPI) 

TRPI is a 1-bit value present only during 
the reference picture selection mode. When 
present, it indicates the presence of the follow- 
ing TRP field. “0” = TRP field not present; “1” = 
TRP field present. TRPI is “0” whenever the 
picture header indicates an I frame or EI
frame. 

Temporal Reference for Prediction (TRP) 

TRP is a 10-bit value indicating the tempo- 
ral reference used for encoding prediction, 
except in the case of B frames. For B frames, 
the frame having the temporal reference speci- 
fied by TRP is used for the prediction in the 
forward direction. 

If the custom picture clock frequency is 
not being used, the two MSBs of TRP are zero 
and the LSBs contain the 8-bit TR value in the 
picture header of the reference picture. If a 
custom picture clock frequency is being used, 
TRP is a 10-bit number consisting of the con- 
catenation of ETR and TR from the reference 
picture header. 

If TRP is not present, the previous anchor 
picture is used for prediction, as when not in 
the reference picture selection mode. TRP is 
valid until the next PSC, GSC, or SSC. 
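
The way TRP is formed from the reference picture header can be sketched as follows. The function and parameter names are hypothetical; only the bit layout (two ETR bits above eight TR bits) comes from the description above.

```python
def make_trp(tr: int, etr: int = 0, custom_clock: bool = False) -> int:
    """Build the 10-bit TRP value from the reference picture's TR and ETR."""
    if not 0 <= tr <= 0xFF:
        raise ValueError("TR is an 8-bit value")
    if not custom_clock:
        # Two MSBs are zero; the eight LSBs carry TR.
        return tr
    if not 0 <= etr <= 0b11:
        raise ValueError("ETR is a 2-bit value")
    # ETR supplies the two MSBs of the 10-bit temporal reference.
    return (etr << 8) | tr


standard_clock = make_trp(0x5A)                           # 0x05A
custom = make_trp(0x5A, etr=0b10, custom_clock=True)      # 0x25A
```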

Back-Channel Message Indication (BCI) 

BCI is a 1- or 2-bit variable-length code- 
word present only during the optional reference 
picture selection mode. “1” indicates the pres- 
ence of the optional back-channel message 
(BCM) field. “01” indicates the absence or the 
end of the back-channel message field. BCM 
and BCI may be repeated when present. 

Back-Channel Message (BCM) 

The variable-length back-channel mes- 
sage is present if the preceding BCI field is set 
to “1.” 



Reference Picture Resampling Parameters 
(RPRP) 

A variable-length field present only during 
the optional reference picture resampling mode. 
This field carries the parameters of the refer- 
ence picture resampling mode. 

Optional H.263 Modes 

Unrestricted Motion Vector Mode 

In this optional mode, motion vectors are 
allowed to point outside the picture. The edge 
samples are used as prediction for the “non- 
existing” samples. The edge sample is found 
by limiting the motion vector to the last full 
sample position within the picture area. 
Motion vector limiting is done separately for 
the horizontal and vertical components. 
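
A minimal sketch of the edge-sample rule, for full-sample positions only: each coordinate is limited independently to the picture area, so any reference outside the picture returns the nearest edge sample. The picture is modeled as a list of rows, which is an assumption made purely for illustration.

```python
def reference_sample(picture, x: int, y: int) -> int:
    """Fetch a sample, substituting the nearest edge sample when (x, y)
    lies outside the picture."""
    height = len(picture)
    width = len(picture[0])
    clamped_x = min(max(x, 0), width - 1)    # horizontal component limited
    clamped_y = min(max(y, 0), height - 1)   # vertical component limited
    return picture[clamped_y][clamped_x]


picture = [
    [1, 2, 3],
    [4, 5, 6],
]
top_left = reference_sample(picture, -5, 0)        # left of the picture -> 1
bottom_right = reference_sample(picture, 10, 10)   # beyond both edges -> 6
```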

Additionally, this mode includes an exten- 
sion of the motion vector range so that larger 
motion vectors can be used (Tables 10.22 and 
10.23). These longer motion vectors improve 
the coding efficiency for the larger picture formats,
such as 4CIF or 16CIF. A significant gain is
also achieved for the other picture formats if 
there is movement along the picture edges, 
camera movement, or background movement. 

When this mode is employed within H.263 
version 2, new reversible variable-length codes 
(RVLCs) are used for encoding the motion vec- 
tors, as shown in Table 10.24. These codes are 
single-valued, as opposed to the baseline dou- 
ble-valued VLCs. The double-valued codes 
were not popular due to limitations in their 
extensibility and their high cost of implementa- 
tion. The RVLCs are also easier to implement. 

Each row in Table 10.24 represents a
motion vector difference in half-pixel units.
“...x1x0” denotes all bits following the leading
“1” in the binary representation of the absolute
value of the motion vector difference. The “s”
bit denotes the sign of the motion vector
difference: “0” for positive and “1” for negative. The
binary representation of the motion vector
difference is interleaved with bits that indicate if
the code continues or ends. The “0” in the last
position indicates the end of the code.

Absolute Value of Motion Vector
Difference in Half-Pixel Units                Code

0                                             1
1                                             0 s 0
“x0” + 2 (2-3)                                0 x0 1 s 0
“x1x0” + 4 (4-7)                              0 x1 1 x0 1 s 0
“x2x1x0” + 8 (8-15)                           0 x2 1 x1 1 x0 1 s 0
“x3x2x1x0” + 16 (16-31)                       0 x3 1 x2 1 x1 1 x0 1 s 0
“x4x3x2x1x0” + 32 (32-63)                     0 x4 1 x3 1 x2 1 x1 1 x0 1 s 0
“x5x4x3x2x1x0” + 64 (64-127)                  0 x5 1 x4 1 x3 1 x2 1 x1 1 x0 1 s 0
“x6x5x4x3x2x1x0” + 128 (128-255)              0 x6 1 x5 1 x4 1 x3 1 x2 1 x1 1 x0 1 s 0
“x7x6x5x4x3x2x1x0” + 256 (256-511)            0 x7 1 x6 1 x5 1 x4 1 x3 1 x2 1 x1 1 x0 1 s 0
“x8x7x6x5x4x3x2x1x0” + 512 (512-1023)         0 x8 1 x7 1 x6 1 x5 1 x4 1 x3 1 x2 1 x1 1 x0 1 s 0
“x9x8x7x6x5x4x3x2x1x0” + 1024 (1024-2047)     0 x9 1 x8 1 x7 1 x6 1 x5 1 x4 1 x3 1 x2 1 x1 1 x0 1 s 0
“x10x9x8x7x6x5x4x3x2x1x0” + 2048 (2048-4095)  0 x10 1 x9 1 x8 1 x7 1 x6 1 x5 1 x4 1 x3 1 x2 1 x1 1 x0 1 s 0

Table 10.24. H.263 Reversible Variable-Length Codes for Motion Vectors.

RVLCs can also be used to increase resil- 
ience to channel errors. Decoding can be per- 
formed by processing the motion vectors in 
the forward and reverse directions. If an error 
is detected while decoding in one direction, 
the decoder can proceed in the reverse direc- 
tion, improving the error resilience of the bit- 
stream. In addition, the motion vector range is 
extended up to [-256, +255.5], depending on 
the picture size. 
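
The construction in Table 10.24 can be sketched in a few lines. This is an illustration of the table only, working on strings of “0” and “1” for readability; the function names are hypothetical, only forward decoding is shown, and any additional bitstream rules the standard imposes are not modeled.

```python
def rvlc_encode(mvd: int) -> str:
    """Encode a motion vector difference (half-pixel units) per Table 10.24."""
    if mvd == 0:
        return "1"
    magnitude = abs(mvd)
    if magnitude > 4095:
        raise ValueError("Table 10.24 covers magnitudes up to 4095")
    sign = "1" if mvd < 0 else "0"
    data_bits = bin(magnitude)[3:]          # bits after the leading "1"
    # Each data bit is followed by "1" (code continues).
    body = "".join(bit + "1" for bit in data_bits)
    # The sign bit is followed by "0" (code ends).
    return "0" + body + sign + "0"


def rvlc_decode(bits: str, pos: int = 0) -> tuple:
    """Decode one codeword starting at pos; return (mvd, next position)."""
    if bits[pos] == "1":
        return 0, pos + 1
    pos += 1
    magnitude = 1                            # the implicit leading "1"
    while True:
        value_bit, flag = bits[pos], bits[pos + 1]
        pos += 2
        if flag == "0":                      # value_bit was the sign bit
            return (-magnitude if value_bit == "1" else magnitude), pos
        magnitude = (magnitude << 1) | int(value_bit)


stream = rvlc_encode(5) + rvlc_encode(0) + rvlc_encode(-7)
first, position = rvlc_decode(stream)
second, position = rvlc_decode(stream, position)
third, position = rvlc_decode(stream, position)
```

Because every codeword begins and ends with a recognizable pattern, a decoder can in principle also parse the same bits from the end of the stream toward the start, which is the property the error-resilience discussion relies on.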



Syntax-Based Arithmetic Coding Mode 

In this optional mode, the variable-length 
coding is replaced with arithmetic coding. The 
SNR and reconstructed pictures will be the 
same, but the bit-rate can be reduced by about 
5% since the requirement of a fixed number of 
bits for information is removed. 

The syntax of the picture, group of blocks, 
and macroblock layers remains exactly the 
same. The syntax of the block layer changes 
slightly in that any number of TCOEF entries 
may be present. 

It is worth noting that use of this mode is 
not widespread. 
Advanced Prediction Mode

In this optional mode, four motion vectors 
per macroblock (one for each Y block) are 
used instead of one. In addition, overlapped 
block motion compensation (OBMC) is used for 
the Y blocks of P frames. 

If one motion vector is used for a macro- 
block, it is defined as four motion vectors with 
the same value. If four motion vectors are used 
for a macroblock, the first motion vector is the
MVD codeword and applies to Y1 in Figure
10.9. The second motion vector is the MVD2
codeword that applies to Y2, the third motion
vector is the MVD3 codeword that applies to
Y3, and the fourth motion vector is the MVD4
codeword that applies to Y4. The motion vector
for Cb and Cr of the macroblock is derived 
from the four Y motion vectors. 

The encoder has to decide which type of 
vector to use. Four motion vectors use more 
bits, but provide improved prediction. This 
mode improves inter-picture prediction and 
yields a significant improvement in picture 
quality for the same bit-rate by reducing block- 
ing artifacts. 

PB Frames Mode 

Like MPEG, H.263 optionally supports PB 
frames. A PB frame consists of one P frame 
(predicted from the previous P frame) and one 
B frame (bi-directionally predicted from the 
previous and current P frame), as shown in 
Figure 10.12. 

With this coding option, the picture rate 
can be increased without substantially increas- 
ing the bit-rate. However, an improved PB 
frames mode is supported in Annex M. This 
original PB frames mode is retained only for 
purposes of compatibility with systems made 
prior to the adoption of Annex M. 



Continuous Presence Multipoint and 
Video Multiplex Mode 

In this optional mode, up to four indepen- 
dent H.263 bitstreams can be multiplexed into 
a single bitstream. The sub-bitstream with the 
lowest identifier number (sent via the SBI 
field) is considered to have the highest priority 
unless a different priority convention is estab- 
lished by external means. 

This feature is designed for use in continuous
presence multipoint applications or other
situations in which separate logical channels 
are not available, but the use of multiple video 
bitstreams is desired. It is not to be used with 
H.324. 

Forward Error Correction Mode 

This optional mode provides forward error 
correction (code and framing) for transmission 
of H.263 video data. It is not to be used with 
H.324. 

Both the framing and the forward error 
correction code are the same as in H.261. 

Advanced Intra-Coding Mode 

This optional mode improves compression 
for intra-macroblocks. It uses intra-block pre- 
diction from neighboring intra-blocks, a modi- 
fied inverse quantization of intra-DCT 
coefficients, and a separate VLC table for intra- 
coefficients. This mode significantly improves 
the compression performance over the intra- 
coding of baseline H.263. 

An additional 1- or 2-bit variable-length 
codeword, INTRA_MODE, is added to the 
macroblock layer immediately following the 
MCBPC field to indicate the prediction mode: 

“0” = DC only 

“10” = Vertical DC and AC 

“11” = Horizontal DC and AC 
[Figure: a PB frame, consisting of a bi-directional (B) frame and a
predicted (P) frame, with arrows marking prediction and bi-directional
prediction.]

Figure 10.12. Baseline H.263 PB Frames.



For intra-coded blocks, if the prediction 
mode is DC only, the zig-zag scan order in Fig- 
ure 7.59 is used. If the prediction mode is verti- 
cal DC and AC, the alternate-vertical scanning 
order in Figure 7.60 is used. If the prediction 
mode is horizontal DC and AC, the alternate- 
horizontal scanning order in Figure 7.61 is 
used. 

For non-intra-blocks, the zig-zag scan 
order in Figure 7.59 is used. 

Deblocking Filter Mode 

This optional mode introduces a deblock- 
ing filter inside the coding loop. The filter is 
applied to the edge boundaries of 8 x 8 blocks 
to reduce blocking artifacts. 

The filter coefficients depend on the mac- 
roblock’s quantizer step size, with larger coef- 
ficients used for a coarser quantizer. This 
mode also allows the use of four motion vec- 
tors per macroblock, as specified in the 
advanced prediction mode, and also allows 
motion vectors to point outside the picture, as 
in the unrestricted motion vector mode. The 
computationally expensive overlapping motion
compensation operation of the advanced prediction
mode is not used so as to keep the complexity
of this mode minimal.

The result is better prediction and a reduc- 
tion in blocking artifacts. 

Slice Structured Mode 

In this optional mode, a slice layer is sub- 
stituted for the GOB layer. This mode provides 
error resilience, makes the bitstream easier to 
use with a packet transport delivery scheme, 
and minimizes video delay. 

The slice layer consists of a slice header 
followed by consecutive complete macro- 
blocks. Two additional modes can be signaled 
to reflect the order of transmission (sequential 
or arbitrary) and the shape of the slices (rect- 
angular or not). These add flexibility to the 
slice structure so that it can be designed for 
different applications. 

Supplemental Enhancement Information

With this optional mode, additional supple- 
mental information may be included in the bit- 
stream to signal enhanced display capability. 

Typical enhancement information can sig- 
nal full- or partial-picture freezes, picture 
freeze releases, or chroma keying for video
compositing. 

The supplemental information may be 
present in the bitstream even though the 
decoder may not be capable of using it. The 
decoder simply discards the supplemental 
information, unless a requirement to support 
the capability has been negotiated by external 
means. 

Improved PB Frames Mode 

This optional mode represents an improve- 
ment compared to the baseline H.263 PB 
frames option. This mode permits forward, 
backward, and bi-directional prediction for B 
frames in a PB frame. The changes to the operation
of the MODB field are shown in Table 10.25.

Bi-directional prediction methods are the 
same in both PB frame modes except that, in 
the improved PB frame mode, no delta vector is 
transmitted. 

In forward prediction, the B macroblock is 
predicted from the previous P macroblock, and 
a separate motion vector is then transmitted. 

In backward prediction, the predicted
macroblock is equal to the future P macro- 
block, and therefore no motion vector is trans- 
mitted. 

Improved PB frames are less susceptible 
to changes that may occur between frames, 
such as when there is a scene cut between the 
previous P frame and the PB frame. 

Reference Picture Selection Mode 

In baseline H.263, a frame may be pre- 
dicted from the previous frame. If a portion of 
the reference frame is lost due to errors or 
packet loss, the quality of future frames is 
degraded. Using this optional mode, it is possi- 
ble to select which reference frame to use for 
prediction, minimizing error propagation. 



Four back-channel messaging signals 
(NEITHER, ACK, NACK, and ACK+NACK) 
are used by the encoder and decoder to specify 
which picture segment will be used for predic- 
tion. For example, a NACK sent to the encoder 
from the decoder indicates that a given frame 
has been degraded by errors. Thus, the 
encoder may choose not to use this frame for 
future prediction, and instead use a different, 
unaffected, reference frame. This reduces 
error propagation, maintaining improved pic- 
ture quality in error-prone environments. 

Temporal, SNR, and Spatial Scalability 
Mode 

In this optional mode, there is support for 
temporal, SNR, and spatial scalability. Scalabil- 
ity allows for the decoding of a sequence at 
more than one quality level. This is done by 
using a hierarchy of pictures and enhancement 
pictures partitioned into one or more layers. 
The lowest layer is called the base layer. 

The base layer is a separately decodable 
bitstream. The enhancement layers can be 
decoded in conjunction with the base layer to 
increase the picture rate, increase the picture 
quality, or increase the picture size. 

Temporal scalability is achieved using bi- 
directionally predicted pictures, or B frames. 
They allow prediction from either or both a 
previous and subsequent picture in the base 
layer. This results in improved compression as 
compared to that of P frames. These B frames 
differ from the B-picture part of a PB frame or 
improved PB frame in that they are separate 
entities in the bitstream. 

CBPB    MVDB    Code     Coding Mode

                0        bi-directional prediction
X               10       bi-directional prediction
        X       110      forward prediction
X       X       1110     forward prediction
                11110    backward prediction
X               11111    backward prediction

Table 10.25. H.263 Variable-Length Code Table for MODB for Improved PB Frame Mode.

SNR scalability refers to enhancement
information that increases the picture quality
without increasing resolution. Since compression
introduces artifacts, the difference
between a decoded picture and the original is
the coding error. Normally, the coding error is
lost at the encoder and never recovered. With
SNR scalability, the coding errors are sent to
the decoder, enabling an enhancement to the
decoded picture. The extra data serves to
increase the signal-to-noise ratio (SNR) of the
picture, hence the term “SNR scalability.”
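
The idea behind SNR scalability can be illustrated numerically. The sketch below is not the H.263 algorithm: it stands in for the base layer with a coarse uniform quantizer and for the enhancement layer with a finer quantizer applied to the coding error. The sample values and step sizes are arbitrary.

```python
def quantize(values, step):
    """Uniform quantizer: snap each value to the nearest multiple of step."""
    return [step * round(value / step) for value in values]


def mean_squared_error(reference, decoded):
    return sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)


original = [12, 47, 83, 120, 201, 255]

# Base layer: coarse quantization, decodable on its own.
base = quantize(original, 32)

# Enhancement layer: the coding error, itself quantized more finely.
coding_error = [o - b for o, b in zip(original, base)]
enhancement = quantize(coding_error, 4)

# A decoder with both layers adds the enhancement to the base picture.
enhanced = [b + e for b, e in zip(base, enhancement)]

base_mse = mean_squared_error(original, base)
enhanced_mse = mean_squared_error(original, enhanced)
```

With these values the enhanced reconstruction has a much smaller mean squared error than the base layer alone, which corresponds to a higher SNR at the same resolution.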

Spatial scalability is closely related to SNR 
scalability. The only difference is that before 
the picture in the reference layer is used to 
predict the picture in the spatial enhancement 
layer, it is interpolated by a factor of two either 
horizontally or vertically (1D spatial scalability),
or both horizontally and vertically (2D
spatial scalability). Other than the upsampling
process, the processing and syntax for a spatial 
scalability picture is the same as for an SNR 
scalability picture. 

Since there is very little syntactical distinc- 
tion between frames using SNR scalability and 
frames using spatial scalability, the frames 
used for either purpose are called EI frames
and EP frames.

The frame in the base layer which is used
for upward prediction in an EI or EP frame
may be an I frame, a P frame, the P-part of a PB
frame, or the P-part of an improved PB frame
(but not a B frame, the B-part of a PB frame, or
the B-part of an improved PB frame).



This mode can be useful for networks hav- 
ing varying bandwidth capacity. 

Reference Picture Resampling Mode 

In this optional mode, the reference frame 
is resampled to a different size prior to using it 
for prediction. 

This allows having a different source refer- 
ence format than the frame being predicted. It 
can also be used for global motion estimation, 
or estimation of rotating motion, by warping 
the shape, size, and location of the reference 
frame. 

Reduced Resolution Update Mode 

An optional mode is provided which allows 
the encoder to send update information for a 
frame encoded at a lower resolution, while still 
maintaining a higher resolution for the refer- 
ence frame, to create a final frame at the 
higher resolution. 

This mode is best used when encoding a 
highly active scene, allowing an encoder to 
increase the frame rate for moving parts of a 
scene, while maintaining a higher resolution in 
more static areas of the scene. 
The syntax is the same as baseline H.263,
but interpretation of the semantics is different.
The dimensions of the macroblocks are doubled,
so the macroblock data size is one-quarter
of what it would have been without this
mode enabled. Therefore, motion vectors must
be doubled in both dimensions. To produce
the final picture, the macroblock is upsampled
to the intended resolution. After upsampling,
the full resolution update is added to the
motion-compensated frame to create the full
resolution frame for future reference.

Independent Segment Decoding Mode 

In this optional mode, picture segment 
boundaries are treated as picture bound- 
aries — no data dependencies across segment 
boundaries are allowed. 

Use of this mode prevents the propagation 
of errors, providing error resilience and recov- 
ery. This mode is best used with slice layers, 
where, for example, the slices can be sized to 
match a specific packet size. 

Alternative Inter-VLC Mode 

The intra-VLC table used in the advanced 
intra-coding mode can also be used for inter- 
block coding when this optional mode is 
enabled. 

Large quantized coefficients and small 
runs of zeros, typically present in intra-blocks, 
become more frequent in inter-blocks when 
small quantizer step sizes are used. When bit 
savings are obtained, and the use of the intra 
quantized DCT coefficient table can be 
detected at the decoder, the encoder will use 
the intra-table. The decoder will first try to 
decode the quantized coefficients using the 
inter-table. If this results in addressing coeffi- 
cients beyond the 64 coefficients of the 8x8 
block, the decoder will use the intra-table. 



Modified Quantization Mode 

This optional mode improves the bit-rate 
control for encoding, reduces CbCr quantiza- 
tion error, expands the range of DCT coeffi- 
cients, and places certain restrictions on 
coefficient values. 

In baseline H.263, the quantizer value may 
be modified at the macroblock level. However, 
only a small adjustment (±1 or ±2) in the value
of the most recent quantizer is permitted. The 
modified quantization mode allows the modifi- 
cation of the quantizer to any value. 

In baseline H.263, the Y and CbCr quantiz- 
ers are the same. The modified quantization 
mode also increases CbCr picture quality by 
using a smaller quantizer step size for the Cb 
and Cr blocks relative to the Y blocks. 

In baseline H.263, when a quantizer 
smaller than eight is employed, quantized coef- 
ficients exceeding the range of [-127, +127] 
are clipped. The modified quantization mode 
also allows coefficients that are outside the 
range of [-127, +127] to be represented. There- 
fore, when a very fine quantizer step size is 
selected, an increase in Y quality is obtained. 

Enhanced Reference Picture Selection 
Mode 

An optional Enhanced Reference Picture 
Selection (ERPS) mode offers enhanced coding 
efficiency and error resilience. It manages a 
multi-picture buffer of stored pictures. 

Data-Partitioned Slice Mode 

An optional Data-Partitioned Slice (DPS) 
mode offers enhanced error resilience. It sepa- 
rates the header and motion vector data from 
the DCT coefficient data and protects the 
motion vector data by using a reversible repre- 
sentation. 
Additional Supplemental Enhancement
Information Specification 

An optional Additional Supplemental 
Enhancement Information Specification pro- 
vides backward-compatible enhancements, 
such as: 

(a) Indication of using a specific fixed-point IDCT

(b) Picture Messages, including message types: 

• Arbitrary binary data 

• Text (arbitrary, copyright, caption, 
video description or Uniform Re- 
source Identifier) 

• Picture header repetition (current, 
previous, next with reliable temporal 
reference or next with unreliable tem- 
poral reference) 

• Interlaced field indications (top or 
bottom) 

• Spare reference picture identification 

Profiles 

Profiles specify the syntax (i.e., algo- 
rithms) for common application-specific config- 
urations. 

Profile 0 

The Baseline Profile, or Profile 0, uses no 
optional modes of operation. 



Profile 1 

Profile 1 (H.320 coding efficiency version 2 
backward-compatibility profile) provides com- 
patibility with H.242 and H.320. It is comprised 
of Profile 0 plus the following modes: 

• Advanced Intra-Coding 

• Deblocking Filter 

• Supplemental Enhancement Informa- 
tion: full-picture freeze 

• Modified Quantization 

Profile 2 

Profile 2 (version 1 backward-compatibility 
profile) provides enhanced coding efficiency 
for the first version of H.263. It is comprised of 
Profile 0 plus the following modes: 

• Advanced Prediction 

Profile 3 

Profile 3 (version 2 interactive and stream- 
ing wireless profile) provides enhanced coding 
efficiency performance and enhanced error 
resilience for wireless devices. It is comprised 
of Profile 0 plus the following modes: 

• Advanced Intra-Coding 

• Deblocking Filter 

• Slice Structured 

• Modified Quantization 
Profile 4

Profile 4 (version 3 interactive and stream- 
ing wireless profile) provides enhanced coding 
efficiency performance and enhanced error 
resilience for wireless devices. It is comprised 
of Profiles 0 and 3 plus the following modes: 

• Data-Partitioned Slice 

• Supplemental Enhancement Informa- 
tion: previous picture header repeti- 
tion 

Profile 5 

Profile 5 (conversational high compression 
profile) provides enhanced coding efficiency 
without adding the delay associated with the 
use of B pictures and without adding error 
resilience features. It is comprised of Profiles 
0, 1, and 2 plus the following modes: 

• Unrestricted Motion Vectors: UUI = “1”

• Enhanced Reference Picture Selec- 
tion 

Profile 6 

Profile 6 (conversational Internet profile) 
provides enhanced coding efficiency perfor- 
mance without adding the delay associated 
with the use of B pictures, and adding some 
error resilience suitable for use on Internet 
Protocol (IP) networks. It is comprised of Pro- 
files 0 and 5 plus the following modes: 

• Slice Structured with Arbitrary Slice 
Ordering (ASO) 

• Supplemental Enhancement Informa- 
tion: previous picture header repeti- 
tion 



Profile 7 

Profile 7 (conversational interlace profile) 
provides enhanced coding efficiency perfor- 
mance for low-delay applications, plus support 
of interlaced video sources. It is comprised of 
Profiles 0 and 5 plus the following modes: 

• Supplemental Enhancement Informa- 
tion: interlaced field indications for 
240-line and 288-line pictures 

Profile 8 

Profile 8 (high latency profile) provides 
enhanced coding efficiency performance for 
applications without critical delay constraints. 
It is comprised of Profiles 0 and 6 plus the fol- 
lowing modes: 

• Reference Picture Resampling 

• Temporal Scalability: B pictures 
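
Because most profiles are defined as earlier profiles plus extra modes, the full mode set of a profile can be found by following those references. The sketch below is a literal reading of the profile descriptions above; the mode names are simply strings taken from the bullet lists and the function name is hypothetical.

```python
# profile number -> (profiles it builds on, modes it adds)
PROFILES = {
    0: ((), ()),
    1: ((0,), ("Advanced Intra-Coding", "Deblocking Filter",
               "SEI: full-picture freeze", "Modified Quantization")),
    2: ((0,), ("Advanced Prediction",)),
    3: ((0,), ("Advanced Intra-Coding", "Deblocking Filter",
               "Slice Structured", "Modified Quantization")),
    4: ((0, 3), ("Data-Partitioned Slice",
                 "SEI: previous picture header repetition")),
    5: ((0, 1, 2), ("Unrestricted Motion Vectors: UUI = 1",
                    "Enhanced Reference Picture Selection")),
    6: ((0, 5), ("Slice Structured with Arbitrary Slice Ordering",
                 "SEI: previous picture header repetition")),
    7: ((0, 5), ("SEI: interlaced field indications",)),
    8: ((0, 6), ("Reference Picture Resampling",
                 "Temporal Scalability: B pictures")),
}


def profile_modes(number: int) -> frozenset:
    """Return every optional mode a profile includes, directly or inherited."""
    bases, added = PROFILES[number]
    modes = set(added)
    for base in bases:
        modes |= profile_modes(base)
    return frozenset(modes)


profile_8 = sorted(profile_modes(8))
```

For example, Profile 8 inherits Advanced Prediction through Profiles 6, 5, and 2, even though its own list adds only two modes.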

Levels 

Levels specify various parameters (resolu- 
tion, frame rate, bit-rate, etc.) within a profile. 

Level 10 

Support up to 176x144 resolution, up to 64 
kbps. 

Level 20 

Support up to 352x288 resolution, up to 
128 kbps. 

Level 30 

Support up to 352x288 resolution, up to 
384 kbps. 
Level 40

Support up to 352x288 resolution, up to 2 
Mbps. 

Level 45 

Support up to 176x144 resolution, up to 
128 kbps. 

Level 50 

Support up to 352x288 resolution, up to 4 
Mbps. 

Level 60 

Support up to 720x288 resolution, up to 8 
Mbps. 

Level 70 

Support up to 720x576 resolution, up to 16 
Mbps. 
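
The level limits listed above lend themselves to a simple lookup. The sketch below checks only the two parameters given here, resolution and bit-rate; levels also constrain other parameters such as frame rate, which are not modeled. Treating 1 Mbps as 1000 kbps is an assumption, and the names are illustrative.

```python
# level -> (maximum width, maximum height, maximum bit-rate in kbps)
LEVELS = {
    10: (176, 144, 64),
    20: (352, 288, 128),
    30: (352, 288, 384),
    40: (352, 288, 2_000),
    45: (176, 144, 128),
    50: (352, 288, 4_000),
    60: (720, 288, 8_000),
    70: (720, 576, 16_000),
}


def levels_supporting(width: int, height: int, kbps: int) -> list:
    """List the levels whose resolution and bit-rate limits cover a stream."""
    return [
        level
        for level, (max_width, max_height, max_kbps) in sorted(LEVELS.items())
        if width <= max_width and height <= max_height and kbps <= max_kbps
    ]


qcif_100k = levels_supporting(176, 144, 100)   # Level 10 is excluded
cif_384k = levels_supporting(352, 288, 384)    # Levels 30 and above, except 45
```

Note that Level 45 sits out of numeric order in capability: it covers only 176x144 pictures, so it does not appear for any 352x288 stream.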
Consumer DV



The DV (digital video) format is used by 
tape-based digital camcorders, and is based on 
IEC 61834 (25 Mbps bit-rate) and the newer 
SMPTE 314M and 370M specifications (25, 50, 
or 100 Mbps bit-rate). The compression algo- 
rithm used is neither motion-JPEG nor MPEG, 
although it shares much in common with 
MPEG I frames. A proprietary compression
algorithm is used; because it is an intra-frame
technique, the compressed video can be edited.

The digitized video is stored in memory 
before compression is done. The correlation 
between the two fields stored in the buffer is 
measured. If the correlation is low, indicating 
inter-field motion, the two fields are individu- 
ally compressed. Normally, the entire frame is 
compressed. In either case, DCT-based com- 
pression is used. 

To achieve a constant 25, 50, or 100 Mbps 
bit-rate, DV uses adaptive quantization, which 
uses the appropriate DCT quantization table 
for each frame. 



Figure 11.1 illustrates the contents of one 
track as written on tape. The ITI sector (insert 
and track information) contains information on 
track status and serves in place of a conventional
control track during video editing.

The audio sector, shown in Figure 11.2,
contains both audio data and auxiliary audio 
data (AAUX). 

The video sector, shown in Figure 11.3, contains
video data and auxiliary video data
(VAUX). VAUX data includes recording date 
and time, lens aperture, shutter speed, color 
balance, and other camera settings. 

The subcode sector stores a variety of infor- 
mation, including timecode, teletext, closed 
captioning in multiple languages, subtitles and 
karaoke lyrics in multiple languages, titles, 
table of contents, chapters, etc. The subcode 
sector, AAUX data, and VAUX data use 5-byte 
blocks of data called packs. 
Figure 11.1. Sector Arrangement for One Track for a 480i System. The total bits per track,
excluding the overwrite margin, is 134,975 (134,850). There are 10 (12) of these tracks per 
video frame. 576i system parameters (if different) are shown in parentheses. 



[Figure: each audio data sync block consists of SYNC (2 bytes), ID
(3 bytes), AAUX data (5 bytes), audio data (72 bytes), and inner parity
(8 bytes). Each outer parity sync block consists of SYNC (2 bytes), ID
(3 bytes), outer parity, and inner parity (8 bytes).]

Figure 11.2. Structure of Sync Blocks in an Audio Sector.
Figure 11.3. Structure of Sync Blocks in a Video Sector.



Audio 

An audio frame starts with an audio sample 
within -50 samples of the beginning of line 1 
(480i systems) or the middle of line 623 (576i 
systems).

Each track contains 14 audio sync blocks.
Nine of them each contain 5 bytes of audio
auxiliary data (AAUX) and 72 bytes of audio
data, as illustrated in Figure 11.2. Audio
samples are shuffled over tracks and data-sync
blocks within a frame. The remaining five
audio sync blocks are used for error
correction.



Two 44.1 kHz, 16-bit channels require a
data rate of about 1.41 Mbps. Four 32 kHz,
12-bit channels require a data rate of about 1.536
Mbps. Two 48 kHz, 16-bit channels require a
data rate of about 1.536 Mbps.
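
These figures are raw PCM rates: sampling rate times bits per sample times number of channels. A minimal sketch of the arithmetic, with an illustrative function name:

```python
def pcm_rate_mbps(sample_rate_hz: int, bits_per_sample: int, channels: int) -> float:
    """Raw PCM data rate in Mbps (1 Mbps = 1,000,000 bits per second)."""
    return sample_rate_hz * bits_per_sample * channels / 1_000_000


two_ch_44k1 = pcm_rate_mbps(44_100, 16, 2)   # 1.4112
four_ch_32k = pcm_rate_mbps(32_000, 12, 4)   # 1.536
two_ch_48k = pcm_rate_mbps(48_000, 16, 2)    # 1.536
```

The four-channel 32 kHz, 12-bit mode and the two-channel 48 kHz, 16-bit mode come out at the same rate, which is why they can occupy the same audio capacity.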

IEC 61834 

IEC 61834 supports a variety of audio sam- 
pling rates: 

48 kHz (16 bits, 2 channels) 

44.1 kHz (16 bits, 2 channels) 

32 kHz (16 bits, 2 channels) 

32 kHz (12 bits, 4 channels) 
Audio sampling may be either locked or
unlocked to the video frame frequency. 

Audio data is processed in frames. At a 
locked 48 kHz sample rate, each frame con- 
tains either 1600 or 1602 audio samples (480i 
system) or 1920 audio samples (576i system). 
For the 480i system, the number of audio sam- 
ples per frame follows a five-frame sequence: 

1600, 1602, 1602, 1602, 1602, 1600, ... 

With a locked 32 kHz sample rate, each 
frame contains either 1066 or 1068 audio sam- 
ples (480i system) or 1280 audio samples (576i 
system). For the 480i system, the number of 
audio samples per frame follows a fifteen-frame 
sequence: 

1066, 1068, 1068, 1068, 1068, 1068, 1068, 1066, 

1068, 1068, 1068, 1068, 1068, 1068, 1068, ... 

For unlocked audio sampling, there is no 
exact number of audio samples per frame, 
although minimum and maximum values are 
specified. 
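
The locked sequences exist because the 480i frame rate does not divide the audio sampling rate evenly. The sketch below checks that each sequence carries exactly the right number of samples over its length; it assumes a 480i frame rate of 30000/1001 Hz and a 576i frame rate of 25 Hz, and the names are illustrative.

```python
from fractions import Fraction

FRAME_RATE_480I = Fraction(30000, 1001)   # assumed: about 29.97 frames/s
FRAME_RATE_576I = Fraction(25)            # assumed: 25 frames/s

SEQUENCE_48K = [1600, 1602, 1602, 1602, 1602]
SEQUENCE_32K = [1066] + [1068] * 6 + [1066] + [1068] * 7


def sequence_is_locked(sequence, sample_rate_hz, frame_rate) -> bool:
    """True if the sequence holds exactly the samples produced in its frames."""
    exact_samples = Fraction(sample_rate_hz) / frame_rate * len(sequence)
    return sum(sequence) == exact_samples


# 48 kHz: 1601.6 samples per frame on average, 8008 per five frames.
locked_48k = sequence_is_locked(SEQUENCE_48K, 48_000, FRAME_RATE_480I)
# 32 kHz: 16016 samples per fifteen frames.
locked_32k = sequence_is_locked(SEQUENCE_32K, 32_000, FRAME_RATE_480I)
```

For 576i systems no sequence is needed, since 48000/25 and 32000/25 are both whole numbers (1920 and 1280).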

SMPTE 314M/370M 

SMPTE 314M and 370M support a more 
limited option, with audio sampling locked to 
the video frame frequency: 

48 kHz (16 bits, 2 channels) for 25 Mbps 

48 kHz (16 bits, 4 channels) for 50 Mbps 

48 kHz (16 bits, 8 channels) for 100 Mbps 

Audio data is processed in frames. At a 
locked 48 kHz sample rate, each frame contains
either 1600 or 1602 audio samples (60-field/frame
system) or 1920 audio samples (50-field/frame
system). For the 60-field/frame
system, the number of audio samples per 
frame follows a five-frame sequence: 



1600, 1602, 1602, 1602, 1602, 1600, ... 

The audio capacity is 1620 samples
per frame for the 60-field/frame system or
1944 samples per frame for the 50-field/frame 
system. The unused space at the end of each 
frame is filled with arbitrary data. 

Audio Auxiliary Data (AAUX) 

AAUX information is added to the shuffled 
audio data as shown in Figure 11.2. The AAUX 
pack includes a 1-byte pack header and four 
bytes of data (payload), resulting in a 5-byte 
AAUX pack. Since there are nine of them per 
video frame, they are numbered from 0 to 8. 
An AAUX source (AS) pack and an AAUX 
source control (ASC) pack must be included in 
the compressed stream. Only the AS and ASC 
packs are currently supported by SMPTE 
314M and 370M, although IEC 61834 supports 
many other pack formats. 

AAUX Source (AS) Pack 

The format for this pack is shown in Table 11.1.

LF Locked audio sample rate 

“0” = locked to video 
“1” = unlocked to video 

AF Audio frame size. Specifies the
number of audio samples per frame

SM Stereo mode 

“0” = multi-stereo audio 
“1” = lumped audio 
Table 11.1. AAUX Source (AS) Pack.

PA Specifies if the audio signals
recorded in CH1 (CH3) are
related to the audio signals
recorded in CH2 (CH4)

“0” = one of pair channels
“1” = independent channels

CHN Number of audio channels within
an audio block

“00” = one channel per block
“01” = two channels per block
“10” = reserved
“11” = reserved

AM Specifies the content of the audio
signal on each channel

ML Multi-language flag

“0” = recorded in multi-language
“1” = not recorded in multi-language

50/60 50- or 60-field system

“0” = 60-field system
“1” = 50-field system

STYPE For SMPTE 314M/370M, specifies
the number of audio blocks
per frame

“00000” = 2 audio blocks
“00001” = reserved
“00010” = 4 audio blocks
“00011” = 8 audio blocks
“00100” to “11111” = reserved

For IEC 61834, specifies the
video system

“00000” = standard-definition
“00001” = reserved
“00010” = high-definition
“00011” to “11111” = reserved

EF Audio emphasis flag 

“0” = on 
“1” = off 

TC Emphasis time constant 

“1” = 50/15 µs
“0” = reserved 

SMP Audio sampling frequency 

“000” = 48 kHz 
“001” = 44.1 kHz 
“010” = 32 kHz 
“011” to “111” = reserved 

QU Audio quantization 

“000” = 16 bits linear 
“001” = 12 bits nonlinear 
“010” = 20 bits linear 
“011” to “111” = reserved 

AAUX Source Control (ASC) Pack 

The format for this pack is shown in Table
11.2.

CGMS Copy generation management 
system 

“00” = copying permitted without 
restriction 
“01” = reserved 
“10” = one copy permitted 
“11” = no copy permitted 

ISR Previous input source 
“00” = analog input 
“01” = digital input 
“10” = reserved 
“11” = no information 



CMP Number of times of compression 

“00” = once 
“01” = twice 
“10” = three or more 
“11” = no information 

SS Source and recorded situation 

“00” = scrambled source with audience 
restrictions and recorded 
without descrambling 
“01” = scrambled source without 
audience restrictions and 
recorded without descrambling 
“10” = source with audience 

restrictions or descrambled 
source with audience 
restrictions 
“11” = no information 

EFC Audio emphasis flags 
“00” = emphasis off 
“01” = emphasis on 
“10” = reserved 
“11” = reserved 

REC S Recording start point 

“0” = at recording start point 
“1” = not at recording start point 

REC E Recording end point 

“0” = at recording end point 
“1” = not at recording end point 

REC M Recording mode 
“001” = original 
“011” = one CH insert 
“100” = four CHs insert 
“101” = two CHs insert 
“111” = invalid recording 

FADE S Fading of recording start point
“0” = fading off
“1” = fading on

Table 11.2. AAUX Source Control (ASC) Pack.

FADE E Fading of recording end point
“0” = fading off
“1” = fading on

DRF Direction flag
“0” = reverse direction
“1” = forward direction

ICH Insert audio channel
“000” = CH1
“001” = CH2
“010” = CH3
“011” = CH4
“100” = CH1, CH2
“101” = CH3, CH4
“110” = CH1, CH2, CH3, CH4
“111” = no information

SPD Playback speed

GEN Indicates the category of the
audio source


Video


As shown in Table 11.3, IEC 61834 uses 
4:1:1 YCbCr for 720 x 480i video (Figure 3.5) 
and 4:2:0 YCbCr for 720 x 576i video (Figure 
3.11). 









SMPTE 314M uses 4:1:1 YCbCr (Figure 
3.5) for both video standards for the 25 Mbps 
implementation. 4:2:2 YCbCr (Figure 3.3) is 
used for the 50 and 100 Mbps implementa- 
tions. 

DCT Blocks 

The Y, Cb, and Cr samples for one frame
are divided into 8x8 blocks, called DCT
blocks. Each DCT block, with the exception of
the right-most DCT blocks for Cb and Cr in
4:1:1 mode, transforms 8 samples x 8 lines
of video data. Rows 1, 3, 5, and 7 of the DCT
block process field 1, while rows 0, 2, 4, and 6
process field 2.

For 480i systems, there are either 10,800 
(4:2:2) or 8100 (4:1:1) DCT blocks per video 
frame. 

For 576i systems, there are either 12,960 
(4:2:2) or 9720 (4:1:1, 4:2:0) DCT blocks per 
video frame. 
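
The DCT block counts above can be reproduced by counting 64-sample blocks in the Y, Cb, and Cr planes. This is a minimal sketch; the chroma subsampling divisors are the usual meaning of each sampling structure, and the function name is illustrative.

```python
# Horizontal and vertical chroma subsampling divisors per structure.
CHROMA_DIVISORS = {"4:2:2": (2, 1), "4:1:1": (4, 1), "4:2:0": (2, 2)}


def dct_blocks_per_frame(width: int, height: int, sampling: str) -> int:
    """Count 64-sample DCT blocks for Y plus Cb plus Cr in one frame."""
    h_div, v_div = CHROMA_DIVISORS[sampling]
    luma_blocks = (width * height) // 64
    chroma_blocks = ((width // h_div) * (height // v_div)) // 64
    return luma_blocks + 2 * chroma_blocks


counts = {
    ("480i", "4:2:2"): dct_blocks_per_frame(720, 480, "4:2:2"),
    ("480i", "4:1:1"): dct_blocks_per_frame(720, 480, "4:1:1"),
    ("576i", "4:2:2"): dct_blocks_per_frame(720, 576, "4:2:2"),
    ("576i", "4:1:1"): dct_blocks_per_frame(720, 576, "4:1:1"),
    ("576i", "4:2:0"): dct_blocks_per_frame(720, 576, "4:2:0"),
}
```

Note that a 4:1:1 chroma line holds 180 samples, which is not a multiple of 8. Counting by total samples still gives the right number of blocks, but it is the reason the right-most Cb and Cr blocks are the exception mentioned above.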

Macroblocks 

As shown in Figure 11.4, each macroblock 
in the 4:2:2 mode consists of four DCT blocks. 
As shown in Figures 11.5 and 11.6, each 
macroblock in the 4:1:1 and 4:2:0 modes con- 
sists of six DCT blocks. 

For 480i systems, the macroblock arrange- 
ment for one frame of 4:1:1 and 4:2:2 YCbCr 
data is shown in Figures 11.7 and 11.8, respec- 
tively. There are either 2700 (4:2:2) or 1350 
(4:1:1) macroblocks per video frame. 

For 576i systems, the macroblock arrange- 
ment for one frame of 4:2:0, 4:1:1, and 4:2:2 
YCbCr data is shown in Figures 11.9, 11.10, 
and 11.11, respectively. There are either 3240 
(4:2:2) or 1620 (4:1:1, 4:2:0) macroblocks per 
video frame. 



Superblocks 

Each superblock consists of 27 macrob- 
locks. 

For 480i systems, the superblock arrange- 
ment for one frame of 4:1:1 and 4:2:2 YCbCr 
data is shown in Figures 11.7 and 11.8, respec- 
tively. There are either 100 (4:2:2) or 50 (4:1:1) 
superblocks per video frame. 

For 576i systems, the superblock arrange- 
ment for one frame of 4:2:0, 4:1:1, and 4:2:2 
YCbCr data is shown in Figures 11.9, 11.10, 
and 11.11, respectively. There are either 120 
(4:2:2) or 60 (4:1:1, 4:2:0) superblocks per 
video frame. 
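
The macroblock and superblock counts follow directly from the DCT block counts. A short sketch of that hierarchy, taking the per-frame DCT block count as input (names are illustrative):

```python
from typing import Tuple

DCT_BLOCKS_PER_MACROBLOCK = {"4:2:2": 4, "4:1:1": 6, "4:2:0": 6}
MACROBLOCKS_PER_SUPERBLOCK = 27


def block_hierarchy(dct_blocks: int, sampling: str) -> Tuple[int, int]:
    """Return (macroblocks, superblocks) per frame for a DCT block count."""
    macroblocks = dct_blocks // DCT_BLOCKS_PER_MACROBLOCK[sampling]
    superblocks = macroblocks // MACROBLOCKS_PER_SUPERBLOCK
    return macroblocks, superblocks


hierarchy_480i_422 = block_hierarchy(10_800, "4:2:2")
hierarchy_480i_411 = block_hierarchy(8_100, "4:1:1")
hierarchy_576i_422 = block_hierarchy(12_960, "4:2:2")
hierarchy_576i_420 = block_hierarchy(9_720, "4:2:0")
```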

Compression 

Like MPEG and H.263, DV uses DCT-based
video compression. However, in this case,
each DCT block is built from two fields, with
each field providing samples from four scan
lines and eight horizontal samples.

Two DCT modes, called 8-8-DCT and
2-4-8-DCT, are available for the transform
process, depending upon the degree of content
variation between the two fields of a video
frame. The 8-8-DCT is the normal 8x8 DCT,
and is used when there is a high degree of
correlation (little motion) between the two
fields. The 2-4-8-DCT uses two 4x8 DCTs (one
for each field), and is used when there is a low
degree of correlation (lots of motion) between
the two fields. Which DCT is used is stored in
the DC coefficient area using a single bit.

The DCT coefficients are quantized to 9 
bits, then divided by a quantization number so 
as to limit the amount of data in one video seg- 
ment to five compressed macroblocks. 







Each DCT block is classified into one of 
four classes based on quantization noise and 
maximum absolute values of the AC coeffi- 
cients. The 2-bit class number is stored in the 
DC coefficient area. 

An area number is used for the selection of 
the quantization step. The area number, of 
which there are four, is based on the horizontal 
and vertical frequencies. 

The quantization step is decided by the 
class number, area number, and quantization 
number (QNO). Quantization information is 
passed in the DIF header of video blocks. 

Variable-length coding converts the quan- 
tized AC coefficients to variable-length codes. 

Figures 11.12 and 11.13 illustrate the 
arrangement of compressed macroblocks. 

Video Auxiliary Data (VAUX) 

VAUX information is added to the shuffled
video data as shown in Figure 11.3. The VAUX
pack includes a 1-byte pack header and 4 bytes
of data (payload), resulting in a 5-byte VAUX
pack. Since there are 45 of them per video
frame, they are numbered from 0 to 44. A
VAUX source (VS) pack and a VAUX source
control (VSC) pack must be included in the
compressed stream. Only the VS and VSC
packs are currently supported by SMPTE
314M, although IEC 61834 supports many
other pack formats.

VAUX Source (VS) Pack 

The format for this pack is shown in Table 
11.4. 

TVCH The number of the television
channel, from 0-999. A value of
0xEEE is reserved for prerecorded
tape or a line input. A value of
0xFFF is reserved for
“no information.”

B/W Black and white flag 

“0” = black and white video 
“1” = color video 



Parameters                 480i System       576i System       720p System       1080i System

active resolution (Y)      720 x 480i        720 x 576i        1280 x 720p       1920 x 1080i

frame rate                 29.97 Hz          25 Hz             50 Hz, 59.94 Hz   25 Hz, 29.97 Hz

YCbCr sampling structure
  IEC 61834                4:1:1             4:2:0             -                 -
  SMPTE 314M               4:1:1, 4:2:2      4:1:1, 4:2:2      -                 -
  SMPTE 370M               -                 -                 4:2:2             4:2:2

form of YCbCr coding       Uniformly quantized PCM,            Uniformly quantized PCM,
                           8 bits per sample.                  10 bits per sample.

active line numbers        23-262, 285-524   23-310, 335-622   26-745            21-560, 584-1123


Table 11.3. IEC 61834, SMPTE 314M, and SMPTE 370M YCbCr Parameters.

Figure 11.4. 4:2:2 Macroblock Arrangement.

Figure 11.5. 4:1:1 Macroblock Arrangement.

Figure 11.6. 4:2:0 Macroblock Arrangement.

Figure 11.7. Relationship Between Superblocks and Macroblocks (4:1:1 YCbCr, 720 x 480i)

Figure 11.8. Relationship Between Superblocks and Macroblocks (4:2:2 YCbCr, 720 x 480i)






Figure 11.9. Relationship Between Superblocks and Macroblocks (4:2:0 YCbCr, 720 x 576i)
[Diagram: a frame of 720 samples by 576 lines divided into superblocks S0,0 through S11,4 (12 rows by 5 columns), each made up of macroblocks.]

Figure 11.10. Relationship Between Superblocks and Macroblocks (4:1:1 YCbCr, 720 x 576i)

Figure 11.11. Relationship Between Superblocks and Macroblocks (4:2:2 YCbCr, 720 x 576i)

Figure 11.12. 4:2:2 Compressed Macroblock Arrangement.

Figure 11.13. 4:2:0 and 4:1:1 Compressed Macroblock Arrangement.




Table 11.4. VAUX Source (VS) Pack.

EN CLF valid flag

“0” = CLF is valid
“1” = CLF is invalid

CLF Color frames identification code

For 480i systems:

“00” = color frame A
“01” = color frame B
“10” = reserved
“11” = reserved

For 576i systems:

“00” = 1st, 2nd field
“01” = 3rd, 4th field
“10” = 5th, 6th field
“11” = 7th, 8th field

SRC Defines the input source of the
video signal

TUN Tuner category. Consists of a 3-bit
area number and a 5-bit satellite
number. “11111111” indicates no
information is available.

VISC “10001000” = -180
“00000000” = 0
“01111000” = 180
“01111111” = no information
other values = reserved

50/60 Same as for AAUX

STYPE For SMPTE 314M, specifies the
video signal type
“00000” = 4:1:1 compression
“00001” to “00011” = reserved
“00100” = 4:2:2 compression
“00101” to “11111” = reserved

For SMPTE 370M, specifies the
video signal type
“00000” to “10011” = reserved
“10100” = 1080i30 or 1080i25
“10101” = 1035i30
“10110” = reserved
“10111” = reserved
“11000” = 720p60 or 720p50
“11001” to “11111” = reserved

For IEC 61834, specifies the
video system
“00000” = standard-definition
“00001” = reserved
“00010” = high-definition
“00011” to “11111” = reserved


VAUX Source Control (VSC) Pack 

The format for this pack is shown in Table 
11.5. 



CGMS Same as for AAUX

ISR Same as for AAUX

CMP Same as for AAUX

SS Same as for AAUX

REC S Same as for AAUX

REC M Recording mode

“00” = original
“01” = reserved
“10” = insert
“11” = invalid recording






Table 11.5. VAUX Source Control (VSC) Pack. 



BCS Broadcast system. Indicates the 
type of information of display 
format with DISP 
“00” = type 0 (IEC 61880, CEA-608) 
“01” = type 1 (ETS 300 294) 

“10” = reserved 
“11” = reserved 

DISP Aspect ratio information 



FF Frame/Field flag. Indicates
whether both fields/frames are
output in order or only one of
them is output twice during one/two
frame period.

“0” = one field/frame output twice
“1” = both fields/frames output in
order

FS First/Second flag. Indicates
which field/frame should be
output during field/frame 1
period.

“0” = field/frame 2
“1” = field/frame 1

FC Frame change flag. Indicates if
the picture of the current frame is
the same as the picture of the
immediately previous one/two
frames.

“0” = same picture
“1” = different picture

IL Interlace flag. Indicates if the data 
of two fields which construct one 
frame are interlaced or non- 
interlaced. 

“0” = noninterlaced 

“1” = interlaced or unrecognized 

SF Still-field picture flag. Indicates 

the time difference between the 
two fields within a frame. 

“0” = 0 seconds 

“1” = 1.001/60 or 1/50 second

SC Still camera picture flag 

“0” = still camera picture 
“1” = not still camera picture 

GEN Indicates the category of the 

video source 



Digital Interfaces 

IEC 61834 and SMPTE 314M both specify 
the data format for a generic digital interface. 
This data format may be sent via IEEE 1394 or 
SDTI, for example. Figure 11.14 illustrates the 
frame data structure. 

Each 720 x 480i 4:1:1 YCbCr frame is
compressed to 103,950 bytes. Including
overhead and audio increases the amount of
data to 120,000 bytes.

The compressed 720 x 480i frame is 
divided into ten DIF (data in frame) 
sequences. Each DIF sequence contains 150 
DIF blocks of 80 bytes each, used as follows: 



135 DIF blocks for video 

9 DIF blocks for audio 

6 DIF blocks used for Header, Subcode, and 
Video Auxiliary (VAUX) information 

Figure 11.15 illustrates the DIF sequence
structure in detail. Each video DIF block
contains 80 bytes of compressed macroblock
data:

3 bytes for DIF block ID information 

1 byte for the header that includes the quantiza- 
tion number (QNO) and block status (STA) 

14 bytes each for Y0, Y1, Y2, and Y3

10 bytes each for Cb and Cr 

720 x 576i frames may use either the 4:2:0 
YCbCr format (IEC 61834) or the 4:1:1 YCbCr 
format (SMPTE 314M), and require 12 DIF 
sequences. Each 720 x 576i frame is com- 
pressed to 124,740 bytes. Including overhead 
and audio increases the amount of data to 
144,000 bytes, requiring 300 packets to trans- 
fer. 
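
The byte counts quoted in this section can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check. One assumption, inferred from the numbers rather than stated in the text, is that the "compressed" frame size counts 77 bytes per video DIF block (everything except the 3-byte ID).

```python
from typing import Tuple

DIF_BLOCK_BYTES = 80
ID_BYTES = 3
BLOCKS_PER_SEQUENCE = {"video": 135, "audio": 9, "header_subcode_vaux": 6}


def frame_bytes(dif_sequences: int) -> Tuple[int, int]:
    """Return (total bytes, compressed video bytes) for one frame."""
    blocks = sum(BLOCKS_PER_SEQUENCE.values())  # 150 DIF blocks
    total = dif_sequences * blocks * DIF_BLOCK_BYTES
    # Assumption: video bytes exclude only the 3-byte DIF block ID.
    video = dif_sequences * BLOCKS_PER_SEQUENCE["video"] * (DIF_BLOCK_BYTES - ID_BYTES)
    return total, video


# ID + header + four Y blocks + Cb and Cr blocks fill one video DIF block.
video_block_bytes = ID_BYTES + 1 + 4 * 14 + 2 * 10

frame_480i = frame_bytes(10)
frame_576i = frame_bytes(12)
bytes_per_packet_576i = frame_576i[0] // 300
```

Under these assumptions the 576i figure of 300 packets works out to 480 bytes, or six DIF blocks, per packet.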

Note that the organization of data trans- 
ferred over the interface differs from the 
actual DV recording format since error correc- 
tion is not required for digital transmission. In 
addition, although the video blocks are num- 
bered in sequence in Figure 11.15, the 
sequence does not correspond to the left-to- 
right, top-to-bottom transmission of blocks of 
video data. Compressed macroblocks are shuf- 
fled to minimize the effect of errors and aid in 
error concealment. Audio data is also shuffled. 
Data is transmitted in the same shuffled order 
as recorded. 

To illustrate the video data shuffling, DV
video frames are organized as superblocks,
with each superblock being composed of 27
compressed macroblocks, as shown in Figures
11.7 through 11.11. A group of 5 superblocks
(one from each superblock column) makes up
one DIF sequence. Tables 11.6 and 11.7
illustrate the transmission order of the DIF
blocks.

Figure 11.14. Packet Formatting for 25 Mbps 4:1:1 YCbCr 720 x 480i Systems.
[Diagram: 1 frame in 1.001/30 second is 10 DIF sequences (DIFS0 to DIFS9). 1 DIF sequence in 1.001/300 second is 150 DIF blocks (DIF0 to DIF149): header (1 DIF), subcode (2 DIF), VAUX (3 DIF), then 135 video and 9 audio DIF blocks. 1 DIF block in 1.001/45000 second is ID (3 bytes), header (1 byte), and data (76 bytes). A compressed macroblock holds Y0, Y1, Y2, and Y3 (14 bytes each) and Cr and Cb (10 bytes each), each starting with its DC coefficient (DC0 to DC5) followed by AC coefficients.]


For the 50 Mbps SMPTE 314M format, 
each compressed 720 x 480i or 720 x 576i 
frame is divided into two channels. Each chan- 
nel uses either ten (480i systems) or twelve 
DIF sequences (576i systems). 
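
The rows visible in Tables 11.6 and 11.7 follow a regular pattern, which the sketch below reproduces. It is inferred from the listed rows only. The elided rows are assumed to follow the same pattern, and the second number in the 50 Mbps "Video DIF Block Number" column is assumed to be the channel (0 or 1). The function and constant names are illustrative.

```python
from typing import Tuple

# Superblock row offset and column used by video DIF blocks 0..4,
# repeating every five blocks (read off the first rows of Table 11.6).
ROW_OFFSETS = (2, 6, 8, 0, 4)
COLUMNS = (2, 1, 3, 0, 4)


def macroblock_for_block(dif_sequence: int, video_block: int, n: int,
                         channel: int = 0,
                         channels: int = 1) -> Tuple[Tuple[int, int], int]:
    """Map a video DIF block to ((superblock row, column), macroblock).

    n is the number of DIF sequences (10 for 480i, 12 for 576i).
    Use channels=2 for the 50 Mbps layout of Table 11.7.
    """
    slot = video_block % 5
    row = (dif_sequence + ROW_OFFSETS[slot]) % n
    superblock = (channels * row + channel, COLUMNS[slot])
    return superblock, video_block // 5


first_25 = macroblock_for_block(0, 0, n=10)
last_25 = macroblock_for_block(0, 134, n=10)
first_50 = macroblock_for_block(0, 0, n=10, channel=1, channels=2)
```

Each group of five consecutive video DIF blocks draws one macroblock from each superblock column, so neighboring blocks in the stream come from widely separated parts of the picture.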

IEEE 1394 

Using the IEEE 1394 interface for transfer- 
ring DV information is discussed in Chapter 6. 



SDTI 

The general concept of SDTI is discussed 
in Chapter 6. 

SMPTE 314M Data 

SMPTE 221M details how to transfer
SMPTE 314M DV data over SDTI.



IEC 61834 Data 

SMPTE 222M details how to transfer IEC
61834 DV data over SDTI.

Figure 11.15. DIF Sequence Detail.
[Diagram: the 150 DIF blocks of one DIF sequence, numbered 0 through 149. H = header section.]



100 Mbps DV Differences 

The 100 Mbps SMPTE 370M format sup- 
ports 1920 x 1080i and 1280 x 720p sources. 

1920 x 1080i sources are scaled to 1280 x 
1080i. 1280 x 720p sources are scaled to 960 x 
720p. 4:2:2 YCbCr sampling is used. 

Each compressed frame is divided into
four channels. Each channel uses either ten
(1080i30 or 720p60 systems) or twelve DIF
sequences (1080i25 or 720p50 systems).



HDV Format 

Developed by Canon, Sharp, Sony and
JVC, HDV supports recording 1920 x 1080 and
1280 x 720 content, using a standard DV tape.
Based on 25 Mbps MPEG-2 packetized
elementary streams and 19 Mbps MPEG-2
transport streams, video compression uses
MPEG-2 and audio compression uses MPEG-1
Layer II.



AVCHD Format 

Developed by Panasonic and Sony,
AVCHD supports recording 1920 x 1080, 1280
x 720, 720 x 480 and 720 x 576 content, using
8cm DVD-RW, 8cm BD-R/RE, SD Memory
Card or HDD instead of tape. Based on
24 Mbps MPEG-2 transport streams, video
compression uses MPEG-4.10 (H.264) and
audio compression uses Dolby® Digital or 
LPCM.

DIF         Video DIF    Compressed Macroblock
Sequence    Block        Superblock    Macroblock
Number      Number       Number        Number

0           0            2,2           0
            1            6,1           0
            2            8,3           0
            3            0,0           0
            4            4,4           0
            ...
            133          0,0           26
            134          4,4           26

1           0            3,2           0
            1            7,1           0
            2            9,3           0
            3            1,0           0
            4            5,4           0
            ...
            133          1,0           26
            134          5,4           26

...

n-1         0            1,2           0
            1            5,1           0
            2            7,3           0
            3            n-1,0         0
            4            3,4           0
            ...
            133          n-1,0         26
            134          3,4           26

Note:

1. n = 10 for 480i systems, n = 12 for 576i systems.


Table 11.6. Video DIF Blocks and Compressed Macroblocks for 25 Mbps (4:1:1 or 4:2:0 YCbCr).

DIF         Video DIF    Compressed Macroblock
Sequence    Block        Superblock    Macroblock
Number      Number       Number        Number

0           0,0          4,2           0
            0,1          5,2           0
            1,0          12,1          0
            1,1          13,1          0
            2,0          16,3          0
            ...
            134,0        8,4           26
            134,1        9,4           26

1           0,0          6,2           0
            0,1          7,2           0
            1,0          14,1          0
            1,1          15,1          0
            2,0          18,3          0
            ...
            134,0        10,4          26
            134,1        11,4          26

...

n-1         0,0          2,2           0
            0,1          3,2           0
            1,0          10,1          0
            1,1          11,1          0
            2,0          14,3          0
            ...
            134,0        6,4           26
            134,1        7,4           26

Note:

1. n = 10 for 480i systems, n = 12 for 576i systems.

Table 11.7. Video DIF Blocks and Compressed Macroblocks for 50 Mbps (4:2:2 YCbCr).


MPEG-1



MPEG-1 audio and video compression was
developed for storing and distributing digital
audio and video. Features include random
access, fast forward, and reverse playback.
MPEG-1 is used as the basis for the original
video CDs (VCD).

The channel bandwidth and image
resolution were set by the available media at
the time (CDs). The goal was playback of
digital audio and video using a standard
compact disc with a bit-rate of 1.416 Mbps
(1.15 Mbps of this is for video).

MPEG-1 is an ISO standard (ISO/IEC
11172), and consists of six parts:

system                 ISO/IEC 11172-1
video                  ISO/IEC 11172-2
audio                  ISO/IEC 11172-3
low bit-rate audio     ISO/IEC 13818-3
conformance testing    ISO/IEC 11172-4
simulation software    ISO/IEC 11172-5



The bitstreams implicitly define the 
decompression algorithms. The compression 
algorithms are up to the individual manufactur- 
ers, allowing a proprietary advantage to be 
obtained within the scope of an international 
standard. 



MPEG vs. JPEG 

JPEG (ISO/IEC 10918) was designed for 
still continuous-tone grayscale and color 
images. It doesn’t handle bi-level (black and 
white) images efficiently, and pseudo-color 
images have to be expanded into the 
unmapped color representation prior to pro- 
cessing. JPEG images may be of any resolution 
and color space, with both lossy and lossless 
algorithms available. 

Since JPEG is such a general purpose stan- 
dard, it has many features and capabilities. By 
adjusting the various parameters, compressed 
image size can be traded against reconstructed 
image quality over a wide range. Image quality 
ranges from “browsing” (100:1 compression 
ratio) to “indistinguishable from the source” 
(about 3:1 compression ratio). Typically, the 
threshold of visible difference between the 
source and reconstructed images is some- 
where between a 10:1 to 20:1 compression ratio. 

JPEG does not use a single algorithm, but
rather a family of four, each designed for a
certain application. The most familiar lossy
algorithm is sequential DCT. Either Huffman
encoding (baseline JPEG) or arithmetic
encoding may be used. When the image is
decoded, it is decoded left-to-right,
top-to-bottom.

Progressive DCT is another lossy
algorithm, requiring multiple scans of the
image. When the image is decoded, a coarse
approximation of the full image is available
right away, with the quality progressively
improving until complete. This makes it ideal
for applications such as image database
browsing. Either spectral selection, successive
approximation, or both may be used. The
spectral selection option encodes the
lower-frequency DCT coefficients first (to
obtain an image quickly), followed by the
higher-frequency ones (to add more detail).
The successive approximation option encodes
the more significant bits of the DCT
coefficients first, followed by the less
significant bits.

The hierarchical mode represents an 
image at multiple resolutions. For example, 
there could be 512 x 512, 1024 x 1024, and 
2048 x 2048 versions of the image. Higher- 
resolution images are coded as differences 
from the next smaller image, requiring fewer 
bits than they would if stored independently. 
Of course, the total number of bits is greater 
than that needed to store just the highest-res- 
olution image. Note that the individual images 
in a hierarchical sequence may be coded pro- 
gressively if desired. 

Also supported is a lossless spatial algo- 
rithm that operates in the pixel domain as 
opposed to the transform domain. A prediction 
is made of a sample value using up to three 
neighboring samples. This prediction then is 
subtracted from the actual value and the differ- 
ence is losslessly coded using either Huffman 
or arithmetic coding. Lossless operation 
achieves about a 2:1 compression ratio. 

Since video is just a series of still images,
and baseline JPEG encoders and decoders
were readily available, people used baseline
JPEG to compress real-time video (also called
motion JPEG or MJPEG). However, this
technique does not take advantage of the
frame-to-frame redundancies to improve
compression, as does MPEG.

Perhaps most important, JPEG is symmet- 
rical, meaning the cost of encoding and decod- 
ing is roughly the same. MPEG, on the other 
hand, was designed primarily for mastering a 
video once and playing it back many times on 
many platforms. To minimize the cost of MPEG 
hardware decoders, MPEG was designed to be 
asymmetrical, with the encoding process 
requiring about 100x the computing power of 
the decoding process. 

Since MPEG is targeted for specific applica- 
tions, the hardware usually supports only a few 
specific resolutions. Also, only one color space 
(YCbCr) is supported using 8-bit samples. 
MPEG is also optimized for a limited range of 
compression ratios. 

If capturing video for editing, you can use 
either baseline JPEG or I-frame-only (intra- 
frame) MPEG to compress to disc in real-time. 
Using JPEG requires that the system be able 
to transfer data and access the hard disk at bit- 
rates of about 4 Mbps for SIF (Standard Input 
Format) resolution. Once the editing is done, 
the result can be converted into MPEG for 
maximum compression. 



Quality Issues 

At bit-rates of about 3-4 Mbps, “broadcast 
quality” is achievable with MPEG-1. However, 
sequences with complex spatial-temporal activ- 
ity (such as sports) may require up to 5-6 
Mbps due to the frame-based processing of 
MPEG-1. MPEG-2 allows similar “broadcast 
quality” at bit-rates of about 4-6 Mbps by sup- 
porting field-based processing. 







Several factors affect the quality of MPEG- 
compressed video: 

• the resolution of the original video source 

• the bit-rate (channel bandwidth) allowed after 
compression 

• motion estimator effectiveness 

One limitation of the quality of the com- 
pressed video is determined by the resolution 
of the original video source. If the original res- 
olution was too low, there will be a general lack 
of detail. 

Motion estimator effectiveness determines 
motion artifacts, such as a reduction in video 
quality when movement starts or when the 
amount of movement is above a certain thresh- 
old. Poor motion estimation will contribute to a 
general degradation of video quality. 

Most importantly, the higher the bit-rate 
(channel bandwidth), the more information 
that can be transmitted, allowing fewer motion 
artifacts to be present or a higher-resolution 
image to be displayed. Generally speaking, 
decreasing the bit-rate does not result in a 
graceful degradation of the decoded video 
quality. The video quality rapidly degrades, 
with the 8x8 blocks becoming clearly visible 
once the bit-rate drops below a given thresh- 
old. 



Audio Overview 

MPEG-1 uses a family of three audio
coding schemes, called Layer I, Layer II, and
Layer III, with increasing complexity and
sound quality. The three layers are
hierarchical: a Layer III decoder handles
Layers I, II, and III; a Layer II decoder
handles only Layers I and II; a Layer I decoder
handles only Layer I. All layers support 16-bit
audio using 16, 22.05, 24, 32, 44.1, or 48 kHz
sampling rates.



For each layer, the bitstream format and
the decoder are specified. The encoder is not
specified, to allow for future improvements. All
layers work with similar bit-rates:

Layer I: 32-448 kbps 

Layer II: 8-384 kbps 

Layer III: 8-320 kbps 

Two audio channels are supported with 
four modes of operation: 

normal stereo 

joint (intensity and/or ms) stereo 
dual channel mono 
single channel mono 

For normal stereo, one channel carries the left
audio signal and one channel carries the right
audio signal. For intensity stereo (supported
by all layers), high frequencies (above 2 kHz)
are combined. The stereo image is preserved
but only the temporal envelope is transmitted.
For ms stereo (supported by Layer III only),
one channel carries the sum signal (L+R) and
the other the difference (L-R) signal. In
addition, pre-emphasis, copyright marks, and
original/copy indication are supported.
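
The sum/difference idea behind ms stereo can be sketched in a few lines. This is only an illustration of the L+R and L-R relationship on plain sample lists. Any scaling applied by a real codec, and the fact that a real codec works on frequency-domain values rather than time samples, are not modeled here.

```python
from typing import List, Tuple


def ms_encode(left: List[float], right: List[float]) -> Tuple[List[float], List[float]]:
    """Convert left/right samples to sum (L+R) and difference (L-R)."""
    mid = [l + r for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid: List[float], side: List[float]) -> Tuple[List[float], List[float]]:
    """Recover left/right samples from the sum and difference signals."""
    left = [(m + s) / 2 for m, s in zip(mid, side)]
    right = [(m - s) / 2 for m, s in zip(mid, side)]
    return left, right


left_in = [1.0, 0.5, -0.25]
right_in = [1.0, 0.25, 0.25]
mid, side = ms_encode(left_in, right_in)
left_out, right_out = ms_decode(mid, side)
```

When the two channels are similar, the difference signal stays close to zero and needs few bits, which is where the coding gain comes from.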

Sound Quality 

To determine which layer should be used 
for a specific application, look at the available 
bit-rate, as each layer was designed to support 
certain bit-rates with a minimum degradation 
of sound quality. 

Layer I, a simplified version of Layer II, has
a target bit-rate of 192 kbps per channel or
higher.

Layer II is identical to MUSICAM, and has
a target bit-rate of 128 kbps per channel. It
was designed as a trade-off between sound
quality and encoder complexity. It is most
useful for bit-rates around 96-128 kbps per
channel.

Layer III (also known as mp3) merges the
best ideas of MUSICAM and ASPEC and has a
target bit-rate of about 64 kbps per channel.
The Layer III format specifies a set of
advanced features that all address a single
goal: to preserve as much sound quality as
possible, even at relatively low bit-rates.

Background Theory 

All layers use a coding scheme based on
psychoacoustic principles, in particular
masking effects where, for example, a loud
tone at one frequency prevents another,
quieter, tone at a nearby frequency from being
heard.

Suppose you have a strong tone with a fre- 
quency of 1000 Hz, and a second tone at 1100 
Hz that is 18 dB lower in intensity. The 1100 Hz 
tone will not be heard; it is masked by the 1000 
Hz tone. However, a tone at 2000 Hz 18 dB 
below the 1000 Hz tone will be heard. In order 
to have the 1000 Hz tone mask it, the 2000 Hz 
tone will have to be about 45 dB down. Any rel- 
atively weak frequency near a strong fre- 
quency is masked; the further you get from a 
frequency, the smaller the masking effect. 

Curves have been developed that plot the 
relative energy versus frequency that is 
masked (concurrent masking). Masking 
effects also occur before (premasking) and 
after (postmasking) a strong frequency if there 
is a significant (30-40 dB) shift in level. The 
reason is believed to be that the brain needs 
processing time. Premasking time is about 2-5 
ms; postmasking can last up to 100 ms. 

Adjusting the noise floor reduces the
amount of needed data, enabling further
compression. CDs use 16 bits of resolution to
achieve a signal-to-noise ratio (SNR) of about
96 dB, which just happens to match the
dynamic range of hearing pretty well (meaning
most people will not hear noise during
silence). If 8-bit resolution were used, there
would be a noticeable noise during silent
moments in the music or between words.
However, noise isn’t noticed during loud
passages due to the masking effect, which
means that around a strong sound you can
raise the noise floor since the noise will be
masked anyway.
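
The 96 dB figure corresponds to the ratio between full scale and one quantization step. The sketch below uses the common rule of thumb of roughly 6 dB per bit, computed as 20·log10(2^bits). This is a simplification; it ignores refinements such as dither and the exact statistics of the signal.

```python
import math


def pcm_dynamic_range_db(bits: int) -> float:
    """Approximate dynamic range of linear PCM: about 6.02 dB per bit."""
    return 20 * math.log10(2 ** bits)


range_16_bit = pcm_dynamic_range_db(16)
range_8_bit = pcm_dynamic_range_db(8)
```

Halving the word length from 16 to 8 bits halves the dynamic range in dB, which is why the noise floor becomes audible in quiet passages.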

For a stereo signal, there usually is redun- 
dancy between channels. All layers may exploit 
these stereo effects by using a joint stereo 
mode, with the most flexible approach being 
used by Layer III. 



Video Coding Layer 

MPEG-1 permits resolutions up to 4095 x
4095 at 60 frames per second (progressive
scan). What many people think of as MPEG-1
is a subset known as Constrained Parameters 
Bitstream (CPB). The CPB is a limited set of 
sampling and bit-rate parameters designed to 
standardize buffer sizes and memory band- 
widths, allowing a nominal guarantee of 
interoperability for decoders and encoders, 
while still addressing the widest possible range 
of applications. Devices not capable of han- 
dling these are not considered to be true 
MPEG-1. Table 12.1 lists some of the con- 
strained parameters. 

The CPB limits video to 396 macroblocks 
(101,376 pixels). Therefore, MPEG-1 video is 
typically coded at SIF resolutions of 352 x 240p 
or 352 x 288p. During encoding, the original 
BT.601 resolution of 704 x 480i or 704 x 576i is 
scaled down to SIF resolution. This is usually 
done by ignoring Field 2 and scaling down 
Field 1 horizontally. During decoding, the SIF 
resolution is scaled up to the 704 x 480i or 704 
x 576i resolution. Note that some entire active 
scan lines and samples on a scan line are 
ignored to ensure the number of Y samples 
can be evenly divided by 16. Table 12.2 lists 
some of the more common MPEG-1 resolutions.
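The macroblock limit explains why SIF is the typical choice. The following sketch counts 16 x 16 macroblocks for a few picture sizes and compares the result with the 396-macroblock limit; the function and constant names are illustrative, and the limit is treated as inclusive.

```python
CPB_MAX_MACROBLOCKS = 396  # constrained parameters limit quoted in the text

def macroblock_count(width: int, height: int) -> int:
    """Number of 16 x 16 macroblocks needed to cover a width x height picture."""
    mb_wide = (width + 15) // 16
    mb_high = (height + 15) // 16
    return mb_wide * mb_high

for width, height in ((352, 240), (352, 288), (704, 480)):
    count = macroblock_count(width, height)
    print(width, height, count, count <= CPB_MAX_MACROBLOCKS)
```

A 352 x 288 picture needs exactly 396 macroblocks (396 x 256 = 101,376 pixels), while a full 704 x 480 picture needs 1320 and falls outside the constrained parameters.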

The coded video rate is limited to 1.856 
Mbps. However, the bit-rate is the most often 
waived parameter, with some applications 
using up to 6 Mbps or higher. 

MPEG-1 video data uses the 4:2:0 YCbCr 
format shown in Figure 3.7. 

Interlaced Video 

MPEG-1 was designed to handle progres- 
sive (also referred to as noninterlaced) video. 
Early on, in an effort to improve video quality, 
several schemes were devised to enable the 
use of both fields of an interlaced picture. 

For example, both fields can be combined 
into a single frame of 704 x 480p or 704 x 576p 
resolution and encoded. During decoding, the 
fields are separated. This, however, results in 
motion artifacts due to a moving object being 
in slightly different places in the two fields. 
Coding the two fields separately avoids motion 
artifacts, but reduces the compression ratio 
since the redundancy between fields isn’t used. 

There were many other schemes for han- 
dling interlaced video, so MPEG-2 defined a 
standard way of handling it (covered in Chap- 
ter 13). 

Encode Preprocessing 

Better images can be obtained by prepro- 
cessing the video stream prior to MPEG 
encoding. 

To avoid serious artifacts during encoding 
of a particular picture, prefiltering can be 
applied over the entire picture or just in spe- 
cific problem areas. Prefiltering before com- 
pression processing is analogous to anti-alias 



filtering prior to A/D conversion. Prefiltering 
may take into account texture patterns, 
motion, and edges, and may be applied at the 
picture, slice, macroblock, or block level. 

MPEG encoding works best on scenes 
with little fast or random movement and good 
lighting. For best results, foreground lighting 
should be clear and background lighting dif- 
fused. Foreground contrast and detail should 
be normal, but low contrast backgrounds con- 
taining soft edges are preferred. Editing tools 
typically allow you to preprocess potential 
problem areas. 

The MPEG-1 specification has example fil- 
ters for scaling down from BT.601 to SIF reso- 
lution. In this instance, field 2 is ignored, 
throwing away half the vertical resolution, and 
a decimation filter is used to reduce the hori- 
zontal resolution of the remaining scan lines by 
a factor of two. Appropriate decimation of the 
Cb and Cr components must still be carried 
out. 

Better video quality may be obtained by 
deinterlacing prior to scaling down to SIF reso- 
lution. When working on macroblocks (de- 
fined later), if the difference between macro- 
blocks between two fields is small, average 
both to generate a new macroblock. Other- 
wise, use the macroblock area from the field of 
the same parity to avoid motion artifacts. 

Coded Frame Types 

There are four types of coded frames. I 
(intra) frames (~1 bit/pixel) are frames coded 
as a stand-alone still image. They allow random 
access points within the video stream. As such, 
I frames should occur about two times a sec- 
ond. I frames should also be used where scene 
cuts occur. 

horizontal resolution    ≤ 768 samples
vertical resolution      ≤ 576 scan lines
picture area             ≤ 396 macroblocks
pel rate                 ≤ 396 x 25 macroblocks per second
picture rate             ≤ 30 frames per second
bit-rate                 ≤ 1.856 Mbps

Table 12.1. Some of the Constrained Parameters for MPEG-1. 



Resolution        Frames per Second

352 x 240p        29.97
352 x 240p        23.976
352 x 288p        25
320 x 240p (1)    29.97
384 x 288p (1)    25

Notes: 
1. Square pixel format. 

Table 12.2. Common MPEG-1 Resolutions. 



P (predicted) frames (~0.1 bit/pixel) are 
coded relative to the nearest previous I or P 
frame, resulting in forward prediction pro- 
cessing, as shown in Figure 12.1. P frames 
provide more compression than I frames, 
through the use of motion compensation, and 
are also a reference for B frames and future P 
frames. 

B (bi-directional) frames (~0.015 bit/pixel) 
use the closest past and future I or P frame as a 
reference, resulting in bi-directional predic- 
tion, as shown in Figure 12.1. B frames provide 
the most compression and decrease noise by 
averaging two frames. Typically, there are two 
B frames separating I or P frames. 



D (DC) frames are frames coded as a 
stand-alone still image, using only the DC com- 
ponent of the DCTs. D frames may not be in a 
sequence containing any other frame types 
and are rarely used. 

A group of pictures (GOP) is a series of 
one or more coded frames intended to assist in 
random accessing and editing. The GOP value 
is configurable during the encoding process. 
The smaller the GOP value, the better the 
response to movement (since the I frames are 
closer together), but the lower the compres- 
sion. 

[Figure: forward prediction and bi-directional prediction arrows over frames shown in display order and in transmit order. Legend: intra (I) frame, bi-directional (B) frame, predicted (P) frame.]

Figure 12.1. MPEG-1 I, P, and B Frames. Some frames are transmitted out of display sequence, 
complicating the interpolation process, and requiring frame reordering by the MPEG decoder. 
Arrows show inter-frame dependencies. 



In the coded bitstream, a GOP must start 
with an I frame and may be followed by any 
number of I, P, or B frames in any order. In dis- 
play order, a GOP must start with an I or B 
frame and end with an I or P frame. Thus, the 
smallest GOP size is a single I frame, with the 
largest size unlimited. 

Originally, each GOP was to be coded and 
displayed independently of any other GOP. 
However, this is not possible unless no B 
frames precede I frames, or if they do, they use 
only backward motion compensation. This 
results in both open and closed GOP formats. 
A closed GOP is a GOP that can be decoded 
without using frames of the previous GOP for 
motion compensation. An open GOP requires 
that they be available. 
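Because a B frame needs its future reference before it can be decoded, the encoder sends that reference first. The sketch below is one simple way to derive a transmit order from a display order; the frame labels and the function name are illustrative, and real encoders may organize this differently.

```python
def display_to_transmit(display_order):
    """Reorder frames so each I or P frame precedes the B frames that
    depend on it as their future reference."""
    transmit = []
    pending_b = []  # B frames waiting for their future reference
    for frame in display_order:
        if frame.startswith("B"):
            pending_b.append(frame)
        else:
            # An I or P frame: send it, then the B frames displayed before it.
            transmit.append(frame)
            transmit.extend(pending_b)
            pending_b = []
    # Within one GOP no B frames should remain, since a GOP ends
    # with an I or P frame in display order.
    transmit.extend(pending_b)
    return transmit

print(display_to_transmit(["I1", "B2", "B3", "P4", "B5", "B6", "P7"]))
```

The decoder performs the inverse step, holding each decoded I or P frame until the B frames that precede it in display order have been output.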



Motion Compensation 

Motion compensation improves compres- 
sion of P and B frames by removing temporal 
redundancies between frames. It works at the 
macroblock level (defined later). 

The technique relies on the fact that within 
a short sequence of the same general image, 
most objects remain in the same location, 
while others move only a short distance. The 
motion is described as a two-dimensional 
motion vector that specifies where to retrieve a 
macroblock from a previously decoded frame 
to predict the sample values of the current 
macroblock. 

After a macroblock has been compressed 
using motion compensation, it contains both 
the spatial difference (motion vectors) and 
content difference (error terms) between the 
reference macroblock and macroblock being 
coded. 
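The relationship between the motion vector, the prediction, and the error terms can be sketched as follows. This is a simplified illustration that assumes integer-sample motion vectors and frames stored as lists of rows; it does no bounds checking, and all names are hypothetical.

```python
MB = 16  # macroblock size in samples and lines

def fetch_block(frame, top, left, size=MB):
    """Copy a size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]

def predict(reference, mb_row, mb_col, motion_vector):
    """Fetch the reference block displaced by (dy, dx) from the macroblock position."""
    dy, dx = motion_vector
    return fetch_block(reference, mb_row * MB + dy, mb_col * MB + dx)

def residual(current_block, predicted_block):
    """Error terms: current samples minus predicted samples."""
    return [[c - p for c, p in zip(c_row, p_row)]
            for c_row, p_row in zip(current_block, predicted_block)]

def reconstruct(predicted_block, error_terms):
    """Decoder side: prediction plus error terms gives back the block."""
    return [[p + e for p, e in zip(p_row, e_row)]
            for p_row, e_row in zip(predicted_block, error_terms)]
```

When the motion vector matches the true displacement, the error terms are all zero and almost no data remains to be coded for that macroblock.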

Note that there are cases where informa- 
tion in a scene cannot be predicted from the 
previous scene, such as when a door opens. 
The previous scene doesn’t contain the details 
of the area behind the door. In cases such as 
this, when a macroblock in a P frame cannot be 
represented by motion compensation, it is 
coded the same way as a macroblock in an I 
frame (using intra-picture coding). 

Macroblocks in B frames are coded using 
either the closest previous or future I or P 
frames as a reference, resulting in four possi- 
ble codings: 

• intra-coding 

no motion compensation 

• forward prediction 

closest previous I or P frame is the 
reference 

• backward prediction 

closest future I or P frame is the 
reference 

• bi-directional prediction 

two frames are used as the reference: 
the closest previous I or P frame and 
the closest future I or P frame 

Backward prediction is used to predict uncovered 
areas that do not appear in previous frames. 

I Frames 

Image blocks and prediction error blocks 
have a high spatial redundancy. Several steps 
are used to remove this redundancy within a 



frame to improve the compression. The 
inverse of these steps is used by the decoder to 
recover the data. 

Macroblock 

A macroblock (shown in Figure 7.55) con- 
sists of a 16-sample x 16-line set of Y compo- 
nents and the corresponding two 8-sample x 8- 
line Cb and Cr components. 

A block is an 8-sample x 8-line set of Y, Cb, 
or Cr values. Note that a Y block covers 
one-fourth of the image area covered by the 
corresponding Cb or Cr block. Thus, a macroblock 
contains four Y blocks, one Cb block, and one 
Cr block, as seen in Figure 12.2. 

There are two types of macroblocks in I 
frames, both using intra-coding, as shown in 
Table 12.9. One (called intra-d) uses the cur- 
rent quantizer scale; the other (called intra-q) 
defines a new value for the quantizer scale. 

If the macroblock type is intra-q, the mac- 
roblock header specifies a 5-bit quantizer scale 
factor. The decoder uses this to calculate the 
DCT coefficients from the transmitted quan- 
tized coefficients. Quantizer scale factors may 
range from 1-31, with zero not allowed. 

If the macroblock type is intra-d, no quan- 
tizer scale is sent, and the decoder uses the 
current one. 

DCT 

Each 8x8 block (of input samples or pre- 
diction error terms) is processed by an 8 x 8 
DCT (discrete cosine transform), resulting in 
an 8 x 8 block of horizontal and vertical fre- 
quency coefficients, as shown in Figure 7.56. 

Input sample values are 0-255, resulting in 
a range of 0-2040 for the DC coefficient and a 
range of about -1000 to +1000 for the AC coef- 
ficients. 
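The DC range quoted above can be reproduced with a direct 8 x 8 DCT. The sketch below assumes the conventional forward DCT definition with a 1/4 scale factor and 1/sqrt(2) weighting for the zero-frequency terms; it is written for clarity, not speed.

```python
import math

N = 8

def dct_2d(block):
    """Direct (slow) 8 x 8 forward DCT of a list of 8 rows of 8 samples."""
    def weight(k):
        return math.sqrt(0.5) if k == 0 else 1.0

    coefficients = [[0.0] * N for _ in range(N)]
    for v in range(N):          # vertical frequency
        for u in range(N):      # horizontal frequency
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            coefficients[v][u] = 0.25 * weight(u) * weight(v) * total
    return coefficients

flat_white = [[255] * N for _ in range(N)]
print(round(dct_2d(flat_white)[0][0]))  # DC term of an all-255 block
```

With this scaling the DC coefficient is eight times the average sample value, so an all-255 block gives 2040 and every AC coefficient is zero.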

[Figure: each macroblock is 16 samples by 16 lines (4 Y blocks); each Y block is 8 samples by 8 lines. Block arrangement within a macroblock: blocks 0 to 3 are the Y blocks, block 4 is Cb, and block 5 is Cr.]

Figure 12.2. MPEG-1 Macroblocks and Blocks. 



Quantizing 

The 8x8 block of frequency coefficients is 
uniformly quantized, limiting the number of 
allowed values. The quantizer step scale is 
derived from the quantization matrix and quan- 
tizer scale and may be different for different 
coefficients and may change between macrob- 
locks. 

The quantizer step size of the DC coeffi- 
cients is fixed at eight. The DC quantized coef- 
ficient is determined by dividing the DC 
coefficient by eight and rounding to the near- 
est integer. AC coefficients are quantized using 
the intra-quantization matrix. 

Zig-Zag Scan 

Zig-zag scanning, starting with the DC 
component, generates a linear stream of quan- 
tized frequency coefficients arranged in order 
of increasing frequency, as shown in Figure 
7.59. This produces long runs of zero coeffi- 
cients. 
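The scan order can be generated rather than tabulated. The sketch below assumes the conventional zig-zag pattern, which walks the anti-diagonals of the 8 x 8 block in alternating directions starting from the DC position; the function names are illustrative.

```python
def zigzag_order(n=8):
    """Return (row, column) positions of an n x n block in zig-zag order."""
    order = []
    for diagonal in range(2 * n - 1):
        rows = range(max(0, diagonal - n + 1), min(diagonal, n - 1) + 1)
        if diagonal % 2 == 0:
            rows = reversed(rows)  # even diagonals run from bottom-left to top-right
        for row in rows:
            order.append((row, diagonal - row))
    return order

def zigzag_scan(block):
    """Flatten an 8 x 8 block (list of rows) into a 64-entry list in zig-zag order."""
    return [block[row][col] for row, col in zigzag_order(len(block))]

print([row * 8 + col for row, col in zigzag_order()][:11])
```

Since quantization tends to zero the high-frequency coefficients, they cluster at the end of the scanned list, which is what makes run-length coding effective.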



Coding of Quantized DC Coefficients 

After the DC coefficients have been quan- 
tized, they are losslessly coded. 

Coding of Y blocks within a macroblock follows 
the order shown in Figure 12.2. The DC value of 
the fourth Y block is the DC predictor for the 
first Y block of the next macroblock (blocks 3 
and 0 in the numbering of Figure 12.2). At the 
beginning of each slice, the DC predictor is set 
to 1024. 

The DC values of each Cb and Cr block are 
coded using the DC value of the correspond- 
ing block of the previous macroblock as a pre- 
dictor. At the beginning of each slice, both DC 
predictors are set to 1024. 

The DCT DC differential values are orga- 
nized by their absolute value as shown in Table 
12.16. [size], which specifies the number of 
additional bits to define the level uniquely, is 
transmitted by a variable-length code, and is 
different for Y and CbCr since the statistics are 
different. For example, a size of four is fol- 
lowed by four additional bits. 

The decoder reverses the procedure to 
recover the quantized DC coefficients. 

Coding of Quantized AC Coefficients 

After the AC coefficients have been quan- 
tized, they are scanned in the zig-zag order 
shown in Figure 7.59 and coded using run- 
length and level. The scan starts in position 1, 
as shown in Figure 7.59, as the DC coefficient 
in position 0 is coded separately. 

The run-lengths and levels are coded as 
shown in Table 12.18. The “s” bit denotes the 
sign of the level; “0” is positive and “1” is nega- 
tive. 

For run-level combinations not shown in 
Table 12.18, an escape sequence is used, 
consisting of the escape code (ESC), followed by 
the run-length and level codes from Table 
12.19. 

After the last DCT coefficient has been 
coded, an EOB code is added to tell the 
decoder that there are no more quantized coef- 
ficients in this 8x8 block. 
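The run-length and level pairing can be sketched as below. The example stops at the (run, level) pairs and the EOB marker; mapping the pairs to the variable-length codes of Table 12.18 is not shown, and the function name is illustrative.

```python
def run_level_pairs(scanned):
    """Convert 64 zig-zag ordered quantized coefficients into (run, level) pairs.

    Position 0 (the DC coefficient) is skipped because it is coded separately.
    'run' is the number of zero coefficients preceding a nonzero 'level'.
    """
    pairs = []
    run = 0
    for level in scanned[1:]:
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    pairs.append("EOB")  # trailing zeros are implied by the end-of-block code
    return pairs

coefficients = [12, 5, 0, 0, -2, 1] + [0] * 58
print(run_level_pairs(coefficients))
```

Here 63 AC positions collapse to three pairs plus EOB, since the long run of trailing zeros is never transmitted.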

P Frames 

Macroblocks 

There are eight types of macroblocks in P 
frames, as shown in Table 12.10, due to the 
additional complexity of motion compensation. 

Skipped macroblocks are predicted mac- 
roblocks with a zero motion vector. Thus, no 
correction is available; the decoder copies 
skipped macroblocks from the previous frame 
into the current frame. The advantage of 
skipped macroblocks is that they require very 
few bits to transmit. They have no code; they 
are coded by having the macroblock address 
increment code skip over them. 

If the [macroblock quant] column in Table 
12.10 has a “1,” the quantizer scale is 
transmitted. For the remaining macroblock types, 
the DCT correction is coded using the previous 
value for quantizer scale. 

If the [motion forward] column in Table 
12.10 has a “1,” horizontal and vertical forward 
motion vectors are successively transmitted. 

If the [coded pattern] column in Table 
12.10 has a “1,” the 6-bit coded block pattern is 
transmitted as a variable-length code. This tells 
the decoder which of the six blocks in the 
macroblock are coded (“1”) and which are not 
coded (“0”). Table 12.14 lists the codewords 
assigned to the 63 possible combinations. 
There is no code for when none of the blocks is 
coded; it is indicated by the macroblock type. 
For macroblocks in I frames and for intra-coded 
macroblocks in P and B frames, the 
coded block pattern is not transmitted, but is 
assumed to be a value of 63 (all blocks are 
coded). 
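The 6-bit pattern can be pictured as one flag per block. The sketch below assumes the blocks are taken in the order Y0, Y1, Y2, Y3, Cb, Cr with the first block in the most significant bit; that bit ordering is an assumption here, and the variable-length codewords of Table 12.14 are not modeled.

```python
def coded_block_pattern(coded_flags):
    """Pack six per-block 'coded' flags (Y0, Y1, Y2, Y3, Cb, Cr) into a 6-bit value."""
    if len(coded_flags) != 6:
        raise ValueError("a macroblock has exactly six blocks")
    pattern = 0
    for flag in coded_flags:
        pattern = (pattern << 1) | int(bool(flag))
    return pattern

def blocks_from_pattern(pattern):
    """Unpack a 6-bit coded block pattern into six booleans."""
    return [bool(pattern & (1 << (5 - index))) for index in range(6)]

print(coded_block_pattern([True] * 6))
print(blocks_from_pattern(0b100001))
```

A pattern of zero never needs a codeword, which is consistent with only 63 combinations being listed.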

To determine which type of macroblock to 
use, the encoder typically makes a series of 
decisions, as shown in Figure 12.3. 

DCT 

Intra-block AC coefficients are trans- 
formed in the same manner as they are for I 
frames. Intra-block DC coefficients are trans- 
formed differently; the predicted values are set 
to 1024, unless the previous block was intra 
coded. 

Non-intra-block coefficients represent dif- 
ferences between sample values rather than 
actual sample values. They are obtained by 
subtracting the motion-compensated values of 
the previous frame from the values in the cur- 
rent macroblock. There is no prediction of the 
DC value. 

Input sample values are -255 to +255, 
resulting in a range of about -2000 to +2000 for 
the AC coefficients. 

[Figure: decision tree. Branches: motion compensation or no motion compensation; coded or not coded; quant or no quant. Outcomes with motion compensation: pred-mc, pred-m, intra-q, intra-d. Outcomes without motion compensation: pred-cq, pred-c, skipped, intra-q, intra-d.]

Figure 12.3. MPEG-1 P Frame Macroblock Type Selection. 



Quantizing 

Intra-blocks are quantized in the same 
manner as they are for I frames. 

Non-intra-blocks are quantized using the 
quantizer scale and the non-intra quantization 
matrix. The AC and DC coefficients are quan- 
tized in the same manner. 

Coding of Intra-Blocks 

Intra-blocks are coded the same way as I 
frame intra blocks. There is a difference in the 
handling of the DC coefficients in that the 
predicted value is 128 (in quantized units, 
corresponding to 1024 before division by the DC 
step size of eight), unless the previous block 
was intra coded. 

Coding of Non-Intra-Blocks 

The coded block pattern (CBP) is used to 
specify which blocks have coefficient data. 
These are coded similarly to the coding of intra 
blocks, except the DC coefficient is coded in 
the same manner as the AC coefficients. 



B Frames 

Macroblocks 

There are twelve types of macroblocks in 
B frames, as shown in Table 12.11, due to the 
additional complexity of backward motion 
compensation. 

Skipped macroblocks are macroblocks 
having the same motion vector and macro- 
block type as the previous macroblock, which 
cannot be intra coded. The advantage of 
skipped macroblocks is that they require very 
few bits to transmit. They have no code; they 
are coded by having the macroblock address 
increment code skip over them. 

If the [macroblock quant] column in Table 
12.11 has a “1,” the quantizer scale is transmit- 
ted. For the rest of the macroblock types, the 
DCT correction is coded using the previous 
value for the quantizer scale. 

[Figure: decision tree. Branches: forward MC, backward MC, or interpolated MC; coded or not coded; quant or no quant. Outcomes: pred-*cq, pred-*c, pred-* or skipped, intra-q, intra-d.]

Figure 12.4. MPEG-1 B Frame Macroblock Type Selection. 



If the [motion forward] column in Table 
12.11 has a “1,” horizontal and vertical forward 
motion vectors are successively transmitted. If 
the [motion backward] column in Table 12.11 
has a “1,” horizontal and vertical backward 
motion vectors are successively transmitted. If 
both forward and backward motion types are 
present, the vectors are transmitted in this 
order: 

horizontal forward 
vertical forward 
horizontal backward 
vertical backward 

If the [coded pattern] column in Table 
12.11 has a “1,” the 6-bit coded block pattern is 
transmitted as a variable-length code. This tells 
the decoder which of the six blocks in the 
macroblock are coded (“1”) and which are not 
coded (“0”). Table 12.14 lists the codewords 
assigned to the 63 possible combinations. 
There is no code for when none of the blocks is 
coded; this is indicated by the macroblock 
type. For macroblocks in I frames and for 
intra-coded macroblocks in P and B frames, the 
coded block pattern is not transmitted, but is 
assumed to be a value of 63 (all blocks are 
coded). 

To determine which type of macroblock to 
use, the encoder typically makes a series of 
decisions, shown in Figure 12.4. 


Coding 

DCT coefficients of blocks are trans- 
formed into quantized coefficients and coded 
in the same way they are for P frames. 

D Frames 

D frames contain only DC-frequency data 
and are intended to be used for fast visible 
search applications. The data contained in a D 
frame should be just sufficient for the user to 
locate the desired video. 

Video Bitstream 

Figure 12.5 illustrates the video bitstream, 
a hierarchical structure with seven layers. 
From top to bottom the layers are: 

Video Sequence 
Sequence Header 
Group of Pictures (GOP) 

Picture 

Slice 

Macroblock (MB) 

Block 

Note that start codes (0x000001xx) must 
be byte aligned by inserting 0-7 “0” bits before 
the start code. 
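Since start codes are byte aligned and share the 0x000001 prefix, a bitstream can be indexed with a plain byte search. The sketch below is a minimal illustration; the name table only covers the start code values that appear in this chapter, and the function name is hypothetical.

```python
START_CODE_NAMES = {
    0x00: "picture_start_code",
    0xB2: "user_data_start_code",
    0xB3: "sequence_header_code",
    0xB5: "extension_start_code",
    0xB7: "sequence_end_code",
    0xB8: "group_start_code",
}

def find_start_codes(data: bytes):
    """Return (byte offset, code value, name) for each 0x000001xx start code."""
    prefix = b"\x00\x00\x01"
    found = []
    position = data.find(prefix)
    while position != -1 and position + 3 < len(data):
        code = data[position + 3]
        if 0x01 <= code <= 0xAF:
            name = f"slice_start_code (row {code})"
        else:
            name = START_CODE_NAMES.get(code, "other")
        found.append((position, code, name))
        position = data.find(prefix, position + 3)
    return found

sample = bytes.fromhex("000001B3AA000001B80000010000000101000001B7")
for offset, code, name in find_start_codes(sample):
    print(offset, hex(code), name)
```

A scan like this is typically the first step of a parser, because it locates the sequence, GOP, picture, and slice boundaries without decoding any variable-length data.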

Video Sequence 

Sequence_end_code 

This 32-bit field has a value of 0x000001B7 
and terminates a video sequence. 

Sequence Header 

Data for each sequence consists of a 
sequence header followed by data for group of 
pictures (GOPs). The structure is shown in 
Figure 12.5. 

Sequence_header_code 

This 32-bit field has a value of 0x000001B3 
and indicates the beginning of a sequence 
header. 

Horizontal_size 

This 12-bit binary value specifies the width 
of the viewable portion of the Y component. 
The width in macroblocks is defined as 
(horizontal_size + 15) / 16. 



Vertical_size 

This 12-bit binary value specifies the 
height of the viewable portion of the Y 
component. The height in macroblocks is defined as 
(vertical_size + 15) / 16. 

Pel_aspect_ratio 

This 4-bit codeword indicates the pixel 
aspect ratio, as shown in Table 12.3. 

Picture_rate 

This 4-bit codeword indicates the frame 
rate, as shown in Table 12.4. 

Bit_rate 

An 18-bit binary value specifying the 
bitstream bit-rate, measured in units of 400 bps, 
rounded upwards. A zero value is not allowed; 
a value of 0x3FFFF specifies variable bit-rate 
operation. If constrained_parameters_flag is a 
“1,” the bit-rate must be ≤ 1.856 Mbps. 

Marker_bit 

Always a “1.” 

Vbv_buffer_size 

This 10-bit binary number specifies the 
minimum size of the video buffering verifier 
needed by the decoder to properly decode the 
sequence. It is defined as: 

B = 16 x 1024 x vbv_buffer_size 

If the constrained_parameters_flag bit is a “1,” 
the vbv_buffer_size must be ≤ 40 kB. 
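The fixed-length fields at the start of the sequence header can be unpacked with a small bit reader. The sketch below assumes the fields appear in the order they are described here, with constrained_parameters_flag immediately after vbv_buffer_size, and assumes B is expressed in bits; the quantizer matrix flags and anything after them are not parsed, and the class and function names are illustrative.

```python
class BitReader:
    """Reads big-endian bit fields from a byte string."""

    def __init__(self, data: bytes):
        self.value = int.from_bytes(data, "big")
        self.remaining = len(data) * 8

    def read(self, count: int) -> int:
        if count > self.remaining:
            raise ValueError("not enough bits")
        self.remaining -= count
        return (self.value >> self.remaining) & ((1 << count) - 1)

def parse_sequence_header(data: bytes) -> dict:
    if data[:4] != b"\x00\x00\x01\xB3":
        raise ValueError("missing sequence_header_code")
    reader = BitReader(data[4:])
    header = {
        "horizontal_size": reader.read(12),
        "vertical_size": reader.read(12),
        "pel_aspect_ratio": reader.read(4),
        "picture_rate": reader.read(4),
    }
    bit_rate = reader.read(18)
    if reader.read(1) != 1:
        raise ValueError("marker_bit must be 1")
    vbv_buffer_size = reader.read(10)
    header["constrained_parameters_flag"] = reader.read(1)
    # 0x3FFFF signals variable bit-rate; otherwise the unit is 400 bps.
    header["bit_rate_bps"] = None if bit_rate == 0x3FFFF else bit_rate * 400
    # B = 16 x 1024 x vbv_buffer_size (assumed to be in bits).
    header["vbv_buffer_bits"] = 16 * 1024 * vbv_buffer_size
    return header
```

For example, a vbv_buffer_size of 20 gives 327,680 bits, which is 40 kB and therefore the largest value compatible with the constrained parameters.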

Figure 12.5. MPEG-1 Video Bitstream Layer Structures. Marker and reserved bits not shown. 

Height / Width    Example          Aspect Ratio Code

forbidden                          0000
1.0000            square pixel     0001
0.6735                             0010
0.7031            576-line 16:9    0011
0.7615                             0100
0.8055                             0101
0.8437            480-line 16:9    0110
0.8935                             0111
0.9157            576-line 4:3     1000
0.9815                             1001
1.0255                             1010
1.0695                             1011
1.0950            480-line 4:3     1100
1.1575                             1101
1.2015                             1110
reserved                           1111

Table 12.3. MPEG-1 pel_aspect_ratio Codewords. 



Frames per Second    Picture Rate Code

forbidden            0000
24/1.001             0001
24                   0010
25                   0011
30/1.001             0100
30                   0101
50                   0110
60/1.001             0111
60                   1000
reserved             1001
reserved             1010
reserved             1011
reserved             1100
reserved             1101
reserved             1110
reserved             1111

Table 12.4. MPEG-1 picture_rate Codewords. 









Constrained_parameters_flag 

This bit is set to a “1” if the following 
constraints are met: 

horizontal_size ≤ 768 samples 
vertical_size ≤ 576 lines 
((horizontal_size + 15)/16) x ((vertical_size + 15)/16) ≤ 396 
((horizontal_size + 15)/16) x ((vertical_size + 15)/16) x picture_rate ≤ 396 x 25 
picture_rate ≤ 30 frames per second 
forward_f_code ≤ 4 
backward_f_code ≤ 4 
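The constraints can be evaluated mechanically. In the sketch below the comparisons are taken as "less than or equal", which is an assumption needed for 352 x 288 at 25 frames per second (exactly 396 macroblocks) to qualify; the function name and default arguments are illustrative.

```python
def meets_constrained_parameters(horizontal_size, vertical_size, picture_rate,
                                 forward_f_code=1, backward_f_code=1):
    """Check the constrained parameter limits for a sequence.

    picture_rate is in frames per second (for example 30 / 1.001 for 29.97).
    """
    macroblocks = ((horizontal_size + 15) // 16) * ((vertical_size + 15) // 16)
    return (horizontal_size <= 768
            and vertical_size <= 576
            and macroblocks <= 396
            and macroblocks * picture_rate <= 396 * 25
            and picture_rate <= 30
            and forward_f_code <= 4
            and backward_f_code <= 4)

print(meets_constrained_parameters(352, 240, 30 / 1.001))
print(meets_constrained_parameters(352, 288, 25))
print(meets_constrained_parameters(352, 288, 30 / 1.001))
```

The third case fails only on the macroblock rate: 396 macroblocks at 29.97 frames per second exceeds 396 x 25 macroblocks per second, which is why the 288-line format is paired with 25 frames per second.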

Load_intra_quantizer_matrix 

This bit is set to a “1” if 
intra_quantizer_matrix follows. If set to a “0,” 
the default values below are used until the next 
occurrence of a sequence header. 



 8  16  19  22  26  27  29  34
16  16  22  24  27  29  34  37
19  22  26  27  29  34  34  38
22  22  26  27  29  34  37  40
22  26  27  29  32  35  40  48
26  27  29  32  35  40  48  58
26  27  29  34  38  46  56  69
27  29  35  38  46  56  69  83



Intra_quantizer_matrix 

An optional list of 64 8-bit values that 
replace the current intra quantizer values. A 
value of zero is not allowed. The value for 
intra_quant [0, 0] is always 8. These values 
take effect until the next occurrence of a 
sequence header. 

Load_non_intra_quantizer_matrix 

This bit is set to a “1” if 
non_intra_quantizer_matrix follows. If set to a 
“0,” the default values below are used until the 
next occurrence of a sequence header. 



16  16  16  16  16  16  16  16
16  16  16  16  16  16  16  16
16  16  16  16  16  16  16  16
16  16  16  16  16  16  16  16
16  16  16  16  16  16  16  16
16  16  16  16  16  16  16  16
16  16  16  16  16  16  16  16
16  16  16  16  16  16  16  16



Non_intra_quantizer_matrix 

An optional list of 64 8-bit values that 
replace the current non-intra quantizer values. 
A value of zero is not allowed. These values 
take effect until the next occurrence of a 
sequence header. 



Extension_start_code 

This optional 32-bit string of 0x000001B5 
indicates the beginning of 
sequence_extension_data. 
sequence_extension_data continues until the 
detection of another start code. 

Sequence_extension_data 

These n x 8 bits are present only if 
extension_start_code is present. 



User_data_start_code 

This optional 32-bit string of 0x000001B2 
indicates the beginning of user_data. user_data 
continues until the detection of another start 
code. 

User_data 

These n x 8 bits are present only if 
user_data_start_code is present. user_data 
must not contain a string of 23 or more consec- 
utive zero bits. 

Timecode              Range of Value    Number of Bits

drop_frame_flag                         1
time_code_hours       0-23              5
time_code_minutes     0-59              6
marker_bit            1                 1
time_code_seconds     0-59              6
time_code_pictures    0-59              6

Table 12.5. MPEG-1 time_code Field. 



Group of Pictures (GOP) Layer 

Data for each group of pictures consists of 
a GOP header followed by picture data. The 
structure is shown in Figure 12.5. 

Group_start_code 

This 32-bit value of 0x000001B8 indicates 
the beginning of a group of pictures. 

Time_code 

These 25 bits indicate timecode informa- 
tion, as shown in Table 12.5. [drop_frame_flag] 
may be set to “1” only if the picture rate is 30/ 
1.001 (29.97) Hz. 
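The 25-bit field can be split using the widths in Table 12.5. The sketch below assumes the fields are packed most significant first in the order the table lists them; the function names are illustrative.

```python
TIME_CODE_FIELDS = (
    ("drop_frame_flag", 1),
    ("time_code_hours", 5),
    ("time_code_minutes", 6),
    ("marker_bit", 1),
    ("time_code_seconds", 6),
    ("time_code_pictures", 6),
)

def unpack_time_code(value: int) -> dict:
    """Split a 25-bit time_code value into its named fields."""
    if not 0 <= value < (1 << 25):
        raise ValueError("time_code is a 25-bit value")
    shift = 25
    fields = {}
    for name, width in TIME_CODE_FIELDS:
        shift -= width
        fields[name] = (value >> shift) & ((1 << width) - 1)
    return fields

def pack_time_code(drop_frame, hours, minutes, seconds, pictures) -> int:
    """Build a 25-bit time_code value; the marker bit is always 1."""
    return ((drop_frame << 24) | (hours << 19) | (minutes << 13)
            | (1 << 12) | (seconds << 6) | pictures)

print(unpack_time_code(pack_time_code(0, 1, 23, 45, 12)))
```

The marker bit sits between the minutes and seconds fields, so a valid time_code always has bit 12 set.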

Closed_gop 

This 1-bit flag is set to “1” if the group of 
pictures has been encoded without motion vec- 
tors referencing the previous group of pic- 
tures. This bit allows support of editing the 
compressed bitstream. 

Broken_link 

This 1-bit flag is set to a “0” during encod- 
ing. It is set to a “1” during editing when the B 
frames following the first I frame of a group of 
pictures cannot be correctly decoded. 



Extension_start_code 

This optional 32-bit string of 0x000001B5 
indicates the beginning of 
group_extension_data. group_extension_data 
continues until the detection of another start 
code. 

Group_extension_data 

These n x 8 bits are present only if 
extension_start_code is present. 



User_data_start_code 

This optional 32-bit string of 0x000001B2 
indicates the beginning of user_data. user_data 
continues until the detection of another start 
code. 

User_data 

These n x 8 bits are present only if 
user_data_start_code is present. user_data 
must not contain a string of 23 or more 
consecutive zero bits. 

Picture Layer 

Data for each picture layer consists of a 
picture header followed by slice data. The 
structure is shown in Figure 12.5. 

Picture_start_code 

This has a 32-bit value of 0x00000100. 

Temporal_reference 

For the first frame in display order of each 
group of pictures, the temporal_reference value 
is zero. This 10-bit binary value then incre- 
ments by one, modulo 1024 for each frame in 
display order. 

Picture_coding_type 

This 3-bit codeword indicates the frame 
type (I frame, P frame, B frame, or D frame) , as 
shown in Table 12.6. D frames are not to be 
used in the same video sequence as other 
frames. 



Coding Type    Code

forbidden      000
I frame        001
P frame        010
B frame        011
D frame        100
reserved       101
reserved       110
reserved       111

Table 12.6. MPEG-1 picture_coding_type Code. 



Vbv_delay 

For constant bit-rates, the 16-bit vbv_delay 
binary value sets the initial occupancy of the 
decoding buffer at the start of decoding a 
picture so that it doesn’t overflow or underflow. 
For variable bit-rates, vbv_delay has a value of 
0xFFFF. 



Full_pel_forward_vector 

This 1-bit flag is present if 
picture_coding_type is “010” (P frames) or 
“011” (B frames). If a “1,” the forward motion 
vectors are based on integer samples, rather 
than half-samples. 

Forward_f_code 

This 3-bit binary number is present if 
picture_coding_type is “010” (P frames) or 
“011” (B frames). Values of “001” to “111” are 
used; a value of “000” is forbidden. 

Two parameters used by the decoder to 
decode the forward motion vectors are derived 
from this field: forward_r_size and forward_f. 
forward_r_size is one less than forward_f_code. 
forward_f is defined in Table 12.7. 



Forward F Code    Forward F Value

001               1
010               2
011               4
100               8
101               16
110               32
111               64

Table 12.7. MPEG-1 forward_f_code Values. 
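The values in Table 12.7 are successive powers of two, so both derived parameters follow directly from the 3-bit code. A minimal sketch, with an illustrative function name:

```python
def motion_vector_params(f_code: int):
    """Derive (r_size, f) from a forward_f_code or backward_f_code of 1 to 7."""
    if not 1 <= f_code <= 7:
        raise ValueError("f_code must be 1-7; a value of 0 is forbidden")
    r_size = f_code - 1   # one less than the code
    f = 1 << r_size       # 1, 2, 4, ... 64, as in Table 12.7
    return r_size, f

for code in range(1, 8):
    print(format(code, "03b"), motion_vector_params(code))
```

The same function applies to backward_f_code, since backward_f is defined the same way as forward_f.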

Full_pel_backward_vector 

This 1-bit flag is present if 
picture_coding_type is “011” (B frames). If a “1,” 
the backward motion vectors are based on 
integer samples, rather than half-samples. 

Backward_f_code 

This 3-bit binary number is present if 
picture_coding_type is “011” (B frames). Values 
of “001” to “111” are used; a value of “000” is 
forbidden. 

Two parameters used by the decoder to 
decode the backward motion vectors are 
derived from this field: backward_r_size and 
backward_f. backward_r_size is one less than 
backward_f_code. backward_f is defined the 
same as forward_f. 



Extra_bit_picture 

A bit which, when set to “1,” indicates that 
extra_information_picture follows. 

Extra_information_picture 

If extra_bit_picture = “1,” then these 9 bits 
follow consisting of 8 bits of data 
(extra_information_picture) and then another 
extra_bit_picture to indicate if a further 9 bits 
follow, and so on. 



Extension_start_code 

This optional 32-bit string of 0x000001B5 
indicates the beginning of 
picture_extension_data. picture_extension_data 
continues until the detection of another start 
code. 

Picture_extension_data 

These n x 8 bits are present only if 
extension_start_code is present. 



User_data_start_code 

This optional 32-bit string of 0x000001B2 
indicates the beginning of user_data. user_data 
continues until the detection of another start 
code. 

User_data 

These n x 8 bits are present only if 
user_data_start_code is present. user_data 
must not contain a string of 23 or more 
consecutive zero bits. 

Slice Layer 

Data for each slice layer consists of a slice 
header followed by macroblock data. The 
structure is shown in Figure 12.5. 

Slice_start_code 

The first 24 bits of this 32-bit field have a 
value of 0x000001. The last 8 bits are the 
slice_vertical_position, and have a value of 
0x01-0xAF. 

slice_vertical_position specifies the vertical 
position in macroblock units of the first 
macroblock in the slice. The value of the first 
row of macroblocks is one. 

Quantizer_scale 

This 5-bit binary number has a value of 1- 
31 (a value of 0 is forbidden). It specifies the 
scale factor of the reconstruction level of the 
DCT coefficients. The decoder uses this value 
until another quantizer_scale is received at 
either the slice or macroblock layer. 

Extra_bit_slice 

A bit which, when set to “1,” indicates that 
extra_information_slice follows. 

Extra_information_slice 

If extra_bit_slice = “1,” then these 9 bits fol- 
low consisting of 8 bits of data 
(extra_information_slice) and then another 
extra_bit_slice to indicate if a further 9 bits fol- 
low, and so on. 

Macroblock (MB) Layer 

Data for each macroblock layer consists of 
a macroblock header followed by motion vec- 
tors and block data. The structure is shown in 
Figure 12.5. 

Macroblock_stuffing 

This optional 11-bit field is a fixed bit string 
of “0000 0001 111” and may be used to increase 
the bit-rate to match the storage or transmis- 
sion requirements. Any number of consecutive 
macroblock_stuffing fields may be used.

Macroblock_escape 

This optional 11-bit field is a fixed bit string 
of “0000 0001 000” and is used when the differ- 
ence between the current macroblock address 
and the previous macroblock address is 
greater than 33. It forces the value of 
macroblock_address_increment to be increased 
by 33. Any number of consecutive 
macroblock_escape fields may be used. 

Macroblock_address_increment 

This is a variable-length codeword that 
specifies the difference between the current 
macroblock address and the previous macrob- 
lock address. It has a maximum value of 33. 
Values greater than 33 are encoded using the 
macroblock_escape field. The variable-length 
codes are listed in Table 12.8. 
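
The interaction of macroblock_stuffing, macroblock_escape, and the variable-length codeword can be sketched as follows. This is an illustrative decoder only: it works on a string of "0" and "1" characters rather than a real bit reader, the function name is hypothetical, and the codewords are transcribed from Table 12.8.

```python
# Codewords from Table 12.8 (spaces removed), mapped to increment values.
MBA_CODES = {
    "1": 1, "011": 2, "010": 3, "0011": 4, "0010": 5,
    "00011": 6, "00010": 7, "0000111": 8, "0000110": 9,
    "00001011": 10, "00001010": 11, "00001001": 12, "00001000": 13,
    "00000111": 14, "00000110": 15, "0000010111": 16, "0000010110": 17,
    "0000010101": 18, "0000010100": 19, "0000010011": 20, "0000010010": 21,
    "00000100011": 22, "00000100010": 23, "00000100001": 24,
    "00000100000": 25, "00000011111": 26, "00000011110": 27,
    "00000011101": 28, "00000011100": 29, "00000011011": 30,
    "00000011010": 31, "00000011001": 32, "00000011000": 33,
}
STUFFING = "00000001111"   # macroblock_stuffing
ESCAPE = "00000001000"     # macroblock_escape


def decode_address_increment(bits: str, pos: int = 0):
    """Return (increment, next_pos) starting at bits[pos]."""
    total = 0
    while True:
        if bits.startswith(STUFFING, pos):
            pos += 11                   # stuffing is simply discarded
        elif bits.startswith(ESCAPE, pos):
            total += 33                 # each escape adds 33
            pos += 11
        else:
            break
    # The codes are prefix-free, so the first match is the only match.
    for length in range(1, 12):
        code = bits[pos:pos + length]
        if code in MBA_CODES:
            return total + MBA_CODES[code], pos + length
    raise ValueError("no valid macroblock_address_increment codeword")
```

For instance, one escape followed by the codeword "1" decodes to an increment of 34, and two escapes followed by "0000 0011 000" decode to 99.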



Macroblock_type 

This is a variable-length codeword that 
specifies the coding method and macroblock 
content. The variable-length codes are listed in 
Tables 12.9 through 12.12. 

Quantizer_scale 

This optional 5-bit binary number has a 
value of 1-31 (a value of 0 is forbidden). It 
specifies the scale factor of the reconstruction 
level of the received DCT coefficients. The 
decoder uses this value until another 
quantizer_scale is received at either the slice or
macroblock layer. The quantizer_scale field is
present only when [macroblock quant] = “1” in 
Tables 12.9 through 12.12. 

Motion_horizontal_forward_code 

This optional variable-length codeword 
contains forward motion vector information as 
defined in Table 12.13. It is present only when 
[motion forward] = “1” in Tables 12.9 through 
12.12.

Motion_horizontal_forward_r 

This optional binary number (of
forward_r_size bits) is used to help decode the
forward motion vectors. It is present only
when [motion forward] = “1” in Tables 12.9
through 12.12, forward_f_code ≠ “001,” and
motion_horizontal_forward_code ≠ “0.”

Motion_vertical_forward_code 

This optional variable-length codeword 
contains forward motion vector information as 
defined in Table 12.13. It is present only when 
[motion forward] = “1” in Tables 12.9 through 
12.12.

Increment                      Increment
Value       Code               Value       Code

1           1                  17          0000 0101 10
2           011                18          0000 0101 01
3           010                19          0000 0101 00
4           0011               20          0000 0100 11
5           0010               21          0000 0100 10
6           0001 1             22          0000 0100 011
7           0001 0             23          0000 0100 010
8           0000 111           24          0000 0100 001
9           0000 110           25          0000 0100 000
10          0000 1011          26          0000 0011 111
11          0000 1010          27          0000 0011 110
12          0000 1001          28          0000 0011 101
13          0000 1000          29          0000 0011 100
14          0000 0111          30          0000 0011 011
15          0000 0110          31          0000 0011 010
16          0000 0101 11       32          0000 0011 001
                               33          0000 0011 000

Table 12.8. MPEG-1 Variable-Length Code Table for macroblock_address_increment.



Macroblock  Macroblock  Motion   Motion    Coded    Intra
Type        Quant       Forward  Backward  Pattern  Macroblock  Code

intra-d     0           0        0         0        1           1
intra-q     1           0        0         0        1           01

Table 12.9. MPEG-1 Variable-Length Code Table for macroblock_type for I Frames.



Macroblock  Macroblock  Motion   Motion    Coded    Intra
Type        Quant       Forward  Backward  Pattern  Macroblock  Code

pred-mc     0           1        0         1        0           1
pred-c      0           0        0         1        0           01
pred-m      0           1        0         0        0           001
intra-d     0           0        0         0        1           0001 1
pred-mcq    1           1        0         1        0           0001 0
pred-cq     1           0        0         1        0           0000 1
intra-q     1           0        0         0        1           0000 01
skipped

Table 12.10. MPEG-1 Variable-Length Code Table for macroblock_type for P Frames.









Macroblock  Macroblock  Motion   Motion    Coded    Intra
Type        Quant       Forward  Backward  Pattern  Macroblock  Code

pred-i      0           1        1         0        0           10
pred-ic     0           1        1         1        0           11
pred-b      0           0        1         0        0           010
pred-bc     0           0        1         1        0           011
pred-f      0           1        0         0        0           0010
pred-fc     0           1        0         1        0           0011
intra-d     0           0        0         0        1           0001 1
pred-icq    1           1        1         1        0           0001 0
pred-fcq    1           1        0         1        0           0000 11
pred-bcq    1           0        1         1        0           0000 10
intra-q     1           0        0         0        1           0000 01
skipped

Table 12.11. MPEG-1 Variable-Length Code Table for macroblock_type for B Frames.



Macroblock  Motion   Motion    Coded    Intra
Quant       Forward  Backward  Pattern  Macroblock  Code

0           0        0         0        1           1

Table 12.12. MPEG-1 Variable-Length Code Table for macroblock_type for D Frames.
Motion Vector                   Motion Vector
Difference     Code             Difference     Code

-16            0000 0011 001    1              010
-15            0000 0011 011    2              0010
-14            0000 0011 101    3              0001 0
-13            0000 0011 111    4              0000 110
-12            0000 0100 001    5              0000 1010
-11            0000 0100 011    6              0000 1000
-10            0000 0100 11     7              0000 0110
-9             0000 0101 01     8              0000 0101 10
-8             0000 0101 11     9              0000 0101 00
-7             0000 0111        10             0000 0100 10
-6             0000 1001        11             0000 0100 010
-5             0000 1011        12             0000 0100 000
-4             0000 111         13             0000 0011 110
-3             0001 1           14             0000 0011 100
-2             0011             15             0000 0011 010
-1             011              16             0000 0011 000
0              1

Table 12.13. MPEG-1 Variable-Length Code Table for
motion_horizontal_forward_code, motion_vertical_forward_code,
motion_horizontal_backward_code, and motion_vertical_backward_code.
Motion_vertical_forward_r

This optional binary number (of
forward_r_size bits) is used to help decode the
forward motion vectors. It is present only
when [motion forward] = “1” in Tables 12.9
through 12.12, forward_f_code ≠ “001,” and
motion_vertical_forward_code ≠ “0.”

Motion_horizontal_backward_code 

This optional variable-length codeword 
contains backward motion vector information 
as defined in Table 12.13. It is present only 
when [motion backward] = “1” in Tables 12.9 
through 12.12. 

Motion_horizontal_backward_r 

This optional binary number (of
backward_r_size bits) is used to help decode
the backward motion vectors. It is present only
when [motion backward] = “1” in Tables 12.9
through 12.12, backward_f_code ≠ “001,” and
motion_horizontal_backward_code ≠ “0.”

Motion_vertical_backward_code 

This optional variable-length codeword 
contains backward motion vector information 
as defined in Table 12.13. The decoded value 
helps decide if motion_vertical_backward_r 
appears in the bitstream. This parameter is 
present only when [motion backward] = “1” in 
Tables 12.9 through 12.12. 

Motion_vertical_backward_r 

This optional binary number (of
backward_r_size bits) is used to help decode
the backward motion vectors. It is present only
when [motion backward] = “1” in Tables 12.9
through 12.12, backward_f_code ≠ “001,” and
motion_vertical_backward_code ≠ “0.”



Coded_block_pattern 

This optional variable-length codeword is 
used to derive the coded block pattern (CBP) 
as shown in Table 12.14. It is present only if 
[coded pattern] = “1” in Tables 12.9 through 
12.12, and indicates which blocks in the mac- 
roblock have at least one transform coefficient 
transmitted. The coded block pattern binary 
number is represented as: 

P0 P1 P2 P3 P4 P5

where Pn = “1” for any coefficient present for
block [n], else Pn = “0.” Block numbering (decimal
format) is given in Figure 12.2.
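
A short sketch of how a decoder might turn the 6-bit coded block pattern into a list of coded block numbers follows. P0 is taken to be the most significant bit, as in the binary representation above; the function name is illustrative.

```python
def coded_blocks(cbp: int) -> list:
    """Return the block numbers n (0-5) for which Pn = 1."""
    if not 0 <= cbp <= 63:
        raise ValueError("coded block pattern must fit in 6 bits")
    # P0 is the most significant of the six bits, P5 the least.
    return [n for n in range(6) if cbp & (0x20 >> n)]
```

For example, a coded block pattern of 60 (binary 111100) yields blocks 0 through 3, and a pattern of 1 yields only block 5.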

End_of_macroblock 

This optional 1-bit field has a value of “1.” It
is present only for D frames. 

Block Layer 

Data for each block layer consists of coeffi- 
cient data. The structure is shown in Figure 
12.5. 

Dct_dc_size_luminance 

This optional variable-length codeword is 
used with intra-coded Y blocks. It specifies the 
number of bits used for dct_dc_differential. The 
variable-length codewords are shown in Table 
12.15. 
Coded                 Coded                  Coded
Block                 Block                  Block
Pattern  Code         Pattern  Code          Pattern  Code

60       111          9        0010 110      43       0001 0000
4        1101         17       0010 101      25       0000 1111
8        1100         33       0010 100      37       0000 1110
16       1011         6        0010 011      26       0000 1101
32       1010         10       0010 010      38       0000 1100
12       1001 1       18       0010 001      29       0000 1011
48       1001 0       34       0010 000      45       0000 1010
20       1000 1       7        0001 1111     53       0000 1001
40       1000 0       11       0001 1110     57       0000 1000
28       0111 1       19       0001 1101     30       0000 0111
44       0111 0       35       0001 1100     46       0000 0110
52       0110 1       13       0001 1011     54       0000 0101
56       0110 0       49       0001 1010     58       0000 0100
1        0101 1       21       0001 1001     31       0000 0011 1
61       0101 0       41       0001 1000     47       0000 0011 0
2        0100 1       14       0001 0111     55       0000 0010 1
62       0100 0       50       0001 0110     59       0000 0010 0
24       0011 11      22       0001 0101     27       0000 0001 1
36       0011 10      42       0001 0100     39       0000 0001 0
3        0011 01      15       0001 0011
63       0011 00      51       0001 0010
5        0010 111     23       0001 0001

Table 12.14. MPEG-1 Variable-Length Code Table for coded_block_pattern.



DCT DC Size               DCT DC Size
Luminance     Code        Luminance     Code

0             100         5             1110
1             00          6             1111 0
2             01          7             1111 10
3             101         8             1111 110
4             110

Table 12.15. MPEG-1 Variable-Length Code Table for dct_dc_size_luminance.
Dct_dc_differential

This optional variable-length codeword is
present after dct_dc_size_luminance if
dct_dc_size_luminance ≠ “0.” The values are
shown in Table 12.16.

Dct_dc_size_chrominance 

This optional variable-length codeword is 
used with intra-coded Cb and Cr blocks. It 
specifies the number of bits used for
dct_dc_differential. The variable-length codewords
are shown in Table 12.17.

Dct_dc_differential

This optional variable-length codeword is
present after dct_dc_size_chrominance if
dct_dc_size_chrominance ≠ “0.” The values are
shown in Table 12.16.
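
The mapping in Table 12.16 follows a simple rule: if the first additional bit is “1,” the bits are the positive value itself; otherwise the value is the bit pattern minus (2^size - 1). A minimal sketch of that rule is given below, with the additional bits passed as a string of "0" and "1" characters for readability; the function name is illustrative.

```python
def decode_dc_differential(size: int, extra_bits: str) -> int:
    """Map (size, additional code) to the DC differential of Table 12.16."""
    if size == 0:
        return 0                        # no additional bits are sent
    if len(extra_bits) != size:
        raise ValueError("expected exactly `size` additional bits")
    value = int(extra_bits, 2)
    if value >= 1 << (size - 1):        # leading bit is 1: positive value
        return value
    return value - ((1 << size) - 1)    # leading bit is 0: negative value
```

For example, size 3 with additional bits “011” gives 3 - 7 = -4, and size 3 with “100” gives 4, matching the -7 to -4 and 4 to 7 rows of the table.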



Dct_coefficient_first 

This optional variable-length codeword is 
used for the first DCT coefficient in non-intra- 
coded blocks, and is defined in Tables 12.18 
and 12.19. 

Dct_coefficient_next 

Up to 63 optional variable-length code- 
words present only for I, P, and B frames. They 
are the DCT coefficients after the first one, and 
are defined in Tables 12.18 and 12.19. 
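
When a run/level pair has no codeword in Table 12.18, the escape code (“0000 01”) is sent, followed by the fixed-length run and level of Table 12.19. The sketch below decodes the bits that follow the escape code; it is illustrative only, operates on a string of "0" and "1" characters, and uses a hypothetical function name.

```python
def decode_escape(bits: str, pos: int = 0):
    """Decode run and level after the escape code; return (run, level, next_pos)."""
    run = int(bits[pos:pos + 6], 2)          # 6-bit run, 0-63
    pos += 6
    first = int(bits[pos:pos + 8], 2)
    pos += 8
    if first == 0x00:                        # 16-bit form, levels 128 to 255
        level = int(bits[pos:pos + 8], 2)
        pos += 8
        if level < 128:
            raise ValueError("level not coded this way in Table 12.19")
    elif first == 0x80:                      # 16-bit form, levels -255 to -128
        second = int(bits[pos:pos + 8], 2)
        pos += 8
        if second == 0 or second > 128:
            raise ValueError("forbidden or out-of-range level")
        level = second - 256
    else:                                    # 8-bit two's complement form
        level = first - 256 if first > 0x80 else first
    return run, level, pos
```

For example, run bits “000010” followed by “1000 0000 0000 0001” decode to run 2, level -255, consuming 22 bits.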

End_of_block 

This 2-bit value (present only for I, P, and B 
frames) is used to indicate that no additional 
non-zero coefficients are present. The value of 
this parameter is “10.” 
DCT DC                 Code      Code       Additional
Differential    Size   (Y)       (CbCr)     Code

-255 to -128    8      1111110   11111110   00000000 to 01111111
-127 to -64     7      111110    1111110    0000000 to 0111111
-63 to -32      6      11110     111110     000000 to 011111
-31 to -16      5      1110      11110      00000 to 01111
-15 to -8       4      110       1110       0000 to 0111
-7 to -4        3      101       110        000 to 011
-3 to -2        2      01        10         00 to 01
-1              1      00        01         0
0               0      100       00
1               1      00        01         1
2 to 3          2      01        10         10 to 11
4 to 7          3      101       110        100 to 111
8 to 15         4      110       1110       1000 to 1111
16 to 31        5      1110      11110      10000 to 11111
32 to 63        6      11110     111110     100000 to 111111
64 to 127       7      111110    1111110    1000000 to 1111111
128 to 255      8      1111110   11111110   10000000 to 11111111

Table 12.16. MPEG-1 Variable-Length Code Table for dct_dc_differential.



DCT DC Size               DCT DC Size
Chrominance   Code        Chrominance   Code

0             00          5             1111 0
1             01          6             1111 10
2             10          7             1111 110
3             110         8             1111 1110
4             1110

Table 12.17. MPEG-1 Variable-Length Code Table for dct_dc_size_chrominance.
Run           Level   Code            Run      Level   Code

end of block          10              escape           0000 01
0 (note 2)    1       1 s             0        5       0010 0110 s
0 (note 3)    1       11 s            0        6       0010 0001 s
1             1       011 s           1        3       0010 0101 s
0             2       0100 s          3        2       0010 0100 s
2             1       0101 s          10       1       0010 0111 s
0             3       0010 1 s        11       1       0010 0011 s
3             1       0011 1 s        12       1       0010 0010 s
4             1       0011 0 s        13       1       0010 0000 s
1             2       0001 10 s       0        7       0000 0010 10 s
5             1       0001 11 s       1        4       0000 0011 00 s
6             1       0001 01 s       2        3       0000 0010 11 s
7             1       0001 00 s       4        2       0000 0011 11 s
0             4       0000 110 s      5        2       0000 0010 01 s
2             2       0000 100 s      14       1       0000 0011 10 s
8             1       0000 111 s      15       1       0000 0011 01 s
9             1       0000 101 s      16       1       0000 0010 00 s

Notes:

1. s = sign of level; “0” for positive; “1” for negative.

2. Used for dct_coefficient_first.

3. Used for dct_coefficient_next.

Table 12.18a. MPEG-1 Variable-Length Code Table for dct_coefficient_first and
dct_coefficient_next.
Run   Level   Code                 Run   Level   Code

0     8       0000 0001 1101 s     0     12      0000 0000 1101 0 s
0     9       0000 0001 1000 s     0     13      0000 0000 1100 1 s
0     10      0000 0001 0011 s     0     14      0000 0000 1100 0 s
0     11      0000 0001 0000 s     0     15      0000 0000 1011 1 s
1     5       0000 0001 1011 s     1     6       0000 0000 1011 0 s
2     4       0000 0001 0100 s     1     7       0000 0000 1010 1 s
3     3       0000 0001 1100 s     2     5       0000 0000 1010 0 s
4     3       0000 0001 0010 s     3     4       0000 0000 1001 1 s
6     2       0000 0001 1110 s     5     3       0000 0000 1001 0 s
7     2       0000 0001 0101 s     9     2       0000 0000 1000 1 s
8     2       0000 0001 0001 s     10    2       0000 0000 1000 0 s
17    1       0000 0001 1111 s     22    1       0000 0000 1111 1 s
18    1       0000 0001 1010 s     23    1       0000 0000 1111 0 s
19    1       0000 0001 1001 s     24    1       0000 0000 1110 1 s
20    1       0000 0001 0111 s     25    1       0000 0000 1110 0 s
21    1       0000 0001 0110 s     26    1       0000 0000 1101 1 s

Note:

1. s = sign of level; “0” for positive; “1” for negative.

Table 12.18b. MPEG-1 Variable-Length Code Table for dct_coefficient_first and
dct_coefficient_next.
Run   Level   Code                    Run   Level   Code

0     16      0000 0000 0111 11 s     0     40      0000 0000 0010 000 s
0     17      0000 0000 0111 10 s     1     8       0000 0000 0011 111 s
0     18      0000 0000 0111 01 s     1     9       0000 0000 0011 110 s
0     19      0000 0000 0111 00 s     1     10      0000 0000 0011 101 s
0     20      0000 0000 0110 11 s     1     11      0000 0000 0011 100 s
0     21      0000 0000 0110 10 s     1     12      0000 0000 0011 011 s
0     22      0000 0000 0110 01 s     1     13      0000 0000 0011 010 s
0     23      0000 0000 0110 00 s     1     14      0000 0000 0011 001 s
0     24      0000 0000 0101 11 s     1     15      0000 0000 0001 0011 s
0     25      0000 0000 0101 10 s     1     16      0000 0000 0001 0010 s
0     26      0000 0000 0101 01 s     1     17      0000 0000 0001 0001 s
0     27      0000 0000 0101 00 s     1     18      0000 0000 0001 0000 s
0     28      0000 0000 0100 11 s     6     3       0000 0000 0001 0100 s
0     29      0000 0000 0100 10 s     11    2       0000 0000 0001 1010 s
0     30      0000 0000 0100 01 s     12    2       0000 0000 0001 1001 s
0     31      0000 0000 0100 00 s     13    2       0000 0000 0001 1000 s
0     32      0000 0000 0011 000 s    14    2       0000 0000 0001 0111 s
0     33      0000 0000 0010 111 s    15    2       0000 0000 0001 0110 s
0     34      0000 0000 0010 110 s    16    2       0000 0000 0001 0101 s
0     35      0000 0000 0010 101 s    27    1       0000 0000 0001 1111 s
0     36      0000 0000 0010 100 s    28    1       0000 0000 0001 1110 s
0     37      0000 0000 0010 011 s    29    1       0000 0000 0001 1101 s
0     38      0000 0000 0010 010 s    30    1       0000 0000 0001 1100 s
0     39      0000 0000 0010 001 s    31    1       0000 0000 0001 1011 s

Note:

1. s = sign of level; “0” for positive; “1” for negative.

Table 12.18c. MPEG-1 Variable-Length Code Table for dct_coefficient_first and
dct_coefficient_next.
Run    Level    Fixed Length Code

0               0000 00
1               0000 01
2               0000 10
...             ...
63              1111 11

       -256     forbidden
       -255     1000 0000 0000 0001
       -254     1000 0000 0000 0010
       ...      ...
       -129     1000 0000 0111 1111
       -128     1000 0000 1000 0000
       -127     1000 0001
       -126     1000 0010
       ...      ...
       -2       1111 1110
       -1       1111 1111
       0        forbidden
       1        0000 0001
       ...      ...
       127      0111 1111
       128      0000 0000 1000 0000
       129      0000 0000 1000 0001
       ...      ...
       255      0000 0000 1111 1111

Table 12.19. Run, Level Encoding Following an Escape Code for
dct_coefficient_first and dct_coefficient_next.
System Bitstream

The system bitstream multiplexes the 
audio and video bitstreams into a single bit- 
stream, and formats it with control information 
into a specific protocol as defined by MPEG-1. 

Packet data may contain either audio or 
video information. Up to 32 audio and 16 video 
streams may be multiplexed together. Two 
types of private data streams are also sup- 
ported. One type is completely private; the 
other is used to support synchronization and 
buffer management. 

Maximum packet sizes usually are about 
2048 bytes, although much larger sizes are 
supported. When stored on CD-ROM, the 
length of the packs coincides with the sectors. 
Typically, there is one audio packet for every 
six or seven video packets. 

Figure 12.6 illustrates the system bit- 
stream, a hierarchical structure with three lay- 
ers. From top to bottom the layers are: 

ISO/IEC 11172 Layer 

Pack 

Packet 

Note that start codes (0x000001xx) must
be byte aligned by inserting 0-7 “0” bits before
the start code.
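
Because start codes are byte aligned, a demultiplexer can locate them with a plain byte search. The sketch below lists every 0x000001xx pattern in a buffer and names the system-level codes described in this section; the helper names are illustrative. A real demultiplexer would use packet_length to skip packet payloads, since the video bitstream carried inside packets contains its own start codes.

```python
START_CODE_PREFIX = b"\x00\x00\x01"
SYSTEM_CODES = {
    0xB9: "ISO_11172_end_code",
    0xBA: "pack_start_code",
    0xBB: "system_header_start_code",
}


def find_start_codes(data: bytes):
    """Return (offset, last_byte) for each start code found in `data`."""
    found = []
    pos = data.find(START_CODE_PREFIX)
    while pos != -1 and pos + 3 < len(data):
        found.append((pos, data[pos + 3]))
        pos = data.find(START_CODE_PREFIX, pos + 3)
    return found


def describe(code: int) -> str:
    """Name a system-level start code; others are reported by value."""
    if code in SYSTEM_CODES:
        return SYSTEM_CODES[code]
    if code >= 0xBC:
        return f"packet start code, stream_ID 0x{code:02X}"
    return f"other start code 0x{code:02X}"
```

Calling `find_start_codes` on a buffer that begins with 0x000001BA reports offset 0 with a last byte of 0xBA, which `describe` names as pack_start_code.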

ISO/IEC 11172 Layer 

ISO_11172_end_code

This 32-bit field has a value of 0x000001B9 
and terminates a system bitstream. 



Pack Layer 

Data for each pack consists of a pack 
header followed by a system header (optional) 
and packet data. The structure is shown in Fig- 
ure 12.6. 

Pack_start_code 

This 32-bit field has a value of 0x000001BA
and identifies the start of a pack. 

Fixed_bits 

These four bits always have a value of
“0010.”

System_clock_reference_32-30 

The system_clock_reference (SCR) is a 33-bit
number coded using three fields separated
by marker bits.

System_clock_reference indicates the
intended time of arrival of the last byte of the
system_clock_reference field at the input of the
decoder. The value of system_clock_reference is
the number of 90 kHz clock periods.
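
As a sketch of how the 33-bit value is split, the code below packs an SCR into the 40-bit group formed by fixed_bits (“0010”), the three system_clock_reference fields, and the marker bit that follows each field, then reverses the process. The function names are illustrative; the second marker bit that precedes mux_rate is not part of this group.

```python
def pack_scr(scr: int) -> bytes:
    """Pack a 33-bit SCR as '0010', SCR[32..30], 1, SCR[29..15], 1, SCR[14..0], 1."""
    if not 0 <= scr < (1 << 33):
        raise ValueError("SCR must fit in 33 bits")
    value = 0b0010
    value = (value << 3) | ((scr >> 30) & 0x7)
    value = (value << 1) | 1                      # marker_bit
    value = (value << 15) | ((scr >> 15) & 0x7FFF)
    value = (value << 1) | 1                      # marker_bit
    value = (value << 15) | (scr & 0x7FFF)
    value = (value << 1) | 1                      # marker_bit
    return value.to_bytes(5, "big")


def unpack_scr(field: bytes) -> int:
    """Recover the SCR from the 5 bytes produced by pack_scr."""
    value = int.from_bytes(field[:5], "big")
    if value >> 36 != 0b0010:
        raise ValueError("fixed_bits are not 0010")
    if not ((value >> 32) & 1 and (value >> 16) & 1 and value & 1):
        raise ValueError("marker bit is not 1")
    return ((((value >> 33) & 0x7) << 30)
            | (((value >> 17) & 0x7FFF) << 15)
            | ((value >> 1) & 0x7FFF))
```

Since the SCR counts 90 kHz periods, a value of 90000 corresponds to one second, and `unpack_scr(pack_scr(90000))` returns 90000.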

Marker_bit

This bit always has a value of “1.”

System_clock_reference_29-15

Marker_bit

This bit always has a value of “1.”

System_clock_reference_14-0

Marker_bit

This bit always has a value of “1.”

Marker_bit

This bit always has a value of “1.”
[Figure 12.6 layer diagram. ISO/IEC 11172 layer: pack start code, pack, pack start code, pack, ..., pack start code, pack, ISO 11172 end code. Pack layer: system clock reference, mux rate, system header, packet N, packet N + 1. Packet layer: STD, data.]

Figure 12.6. MPEG-1 System Bitstream Layer Structures. Marker and reserved bits not shown.



Mux_rate 

This 22-bit binary number specifies the 
rate at which the decoder receives the bit- 
stream. It specifies units of 50 bytes per sec- 
ond, rounded upwards. A value of zero is not 
allowed. 
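
The rounding rule can be illustrated with a small calculation. The function name below is hypothetical; the arithmetic follows the units of 50 bytes per second, the round-up rule, and the 22-bit field width stated above.

```python
def mux_rate_field(bytes_per_second: int) -> int:
    """Return the 22-bit mux_rate value for a stream rate in bytes per second."""
    value = -(-bytes_per_second // 50)      # integer ceiling of rate / 50
    if not 1 <= value < (1 << 22):
        raise ValueError("mux_rate must be nonzero and fit in 22 bits")
    return value
```

For example, 150,000 bytes per second gives a mux_rate of 3000, while 150,001 bytes per second rounds up to 3001.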

Marker_bit 

This bit always has a value of “1.” 

System Header 

System_header_start_code 

This 32-bit field has a value of 0x000001BB
and identifies the start of a system header. 

Header_length

This 16-bit binary number specifies the
number of bytes in the system header following
header_length.



Marker_bit

This bit always has a value of “1.” 

Rate_bound 

This 22-bit binary number specifies an
integer value greater than or equal to the maximum
value of mux_rate. It may be used by the
decoder to determine if it is capable of decoding
the entire bitstream.

Marker_bit

This bit always has a value of “1.” 

Audio_bound 

This 6-bit binary number, with a range of 
0-32, specifies an integer value greater than or 
equal to the maximum number of simulta- 
neously active audio streams. 

Fixed_flag 

This bit specifies fixed bit-rate (“1”) or vari- 
able bit-rate (“0”) operation. 
CSPS_flag

This bit specifies whether the bitstream is
a constrained system parameter stream (“1”)
or not (“0”).

System_audio_lock_flag 

This bit has a value of “1” if there is a con- 
stant relationship between the audio sampling 
rate and the decoder’s system clock frequency. 

System_video_lock_flag 

This bit has a value of “1” if there is a con- 
stant relationship between the video picture 
rate and the decoder’s system clock frequency. 

Marker_bit 

This bit always has a value of “1.” 

Video_bound 

This 5-bit binary number, with a range of 
0-16, specifies an integer value greater than or 
equal to the maximum number of simulta- 
neously active video streams. 

Reserved_byte 

These eight bits always have a value of
“1111 1111.”



Stream_ID 

This optional 8-bit field, as defined in Table
12.20, indicates the type and stream number to
which the following STD_buffer_bound_scale
and STD_buffer_size_bound fields refer.
Each audio and video stream present in the
system bitstream must be specified only once
in each system header.

Fixed_bits 

This optional 2-bit field has a value of “11.” 
It is present only if stream_ID is present.



Stream Type                          Stream ID

all audio streams                    1011 1000
all video streams                    1011 1001
reserved stream                      1011 1100
private stream 1                     1011 1101
padding stream                       1011 1110
private stream 2                     1011 1111
audio stream number xxxxx            110x xxxx
video stream number xxxx             1110 xxxx
reserved data stream number xxxx     1111 xxxx

Table 12.20. MPEG-1 stream_ID Code.



STD_buffer_bound_scale 

This optional 1-bit field specifies the scaling
factor used to interpret
STD_buffer_size_bound. For an audio stream, it 
has a value of “0.” For a video stream, it has a 
value of “1.” For other stream types, it can be 
either a “0” or a “1.” It is present only if 
stream_ID is present. 

STD_buffer_size_bound 

This optional 13-bit binary number specifies
a value greater than or equal to the maximum
decoder input buffer size. If
STD_buffer_bound_scale = “0,” then
STD_buffer_size_bound measures the size in
units of 128 bytes. If STD_buffer_bound_scale =
“1,” then STD_buffer_size_bound measures the
size in units of 1024 bytes. It is present only if
stream_ID is present.
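
The scale and size fields combine into a byte count as sketched below. The function name is illustrative; the same arithmetic applies to the STD_buffer_scale and STD_buffer_size fields of the packet layer.

```python
def std_buffer_bytes(scale: int, size: int) -> int:
    """Convert a 1-bit scale and a 13-bit size field to a buffer size in bytes."""
    if scale not in (0, 1):
        raise ValueError("scale is a 1-bit field")
    if not 0 <= size < (1 << 13):
        raise ValueError("size is a 13-bit field")
    return size * (1024 if scale else 128)
```

For example, a scale of “0” with a size of 32 describes a 4096-byte buffer, and a scale of “1” with a size of 46 describes a 47,104-byte buffer.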








Packet Layer 

Packet_start_code_prefix 

This 24-bit field has a value of 0x000001. 
Together with the stream_ID that follows, it 
indicates the start of a packet. 

Stream_ID 

This 8-bit binary number specifies the type 
and number of the bitstream present, as 
defined in Table 12.20. 
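
A sketch of how the stream_ID values in Table 12.20 might be classified in software is shown below. The function name and the returned strings are illustrative.

```python
FIXED_STREAM_IDS = {
    0xB8: "all audio streams",
    0xB9: "all video streams",
    0xBC: "reserved stream",
    0xBD: "private stream 1",
    0xBE: "padding stream",
    0xBF: "private stream 2",
}


def classify_stream_id(stream_id: int) -> str:
    """Return the Table 12.20 stream type for an 8-bit stream_ID."""
    if stream_id in FIXED_STREAM_IDS:
        return FIXED_STREAM_IDS[stream_id]
    if stream_id & 0xE0 == 0xC0:                 # 110x xxxx
        return f"audio stream {stream_id & 0x1F}"
    if stream_id & 0xF0 == 0xE0:                 # 1110 xxxx
        return f"video stream {stream_id & 0x0F}"
    if stream_id & 0xF0 == 0xF0:                 # 1111 xxxx
        return f"reserved data stream {stream_id & 0x0F}"
    raise ValueError("stream_ID not defined in Table 12.20")
```

The 5-bit audio number and 4-bit video number account for the limits of 32 audio and 16 video streams mentioned earlier.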

Packet_length

This 16-bit binary number specifies the
number of bytes in the packet after the
packet_length field.

Stuffing_byte

This optional parameter has a value of
“1111 1111.” Up to 16 consecutive stuffing_bytes
may be used to meet the requirements of the
storage medium. It is present only if stream_ID
≠ private stream 2.



STD_bits

These optional two bits have a value of “01”
and indicate that the STD_buffer_scale and
STD_buffer_size fields follow. This field may be
present only if stream_ID ≠ private stream 2.

STD_buffer_scale

This optional 1-bit field specifies the scaling
factor used to interpret STD_buffer_size.
For an audio stream, it has a value of “0.” For a
video stream, it has a value of “1.” For other
stream types, it can be either a “0” or a “1.”
This field is present only if STD_bits is present
and stream_ID ≠ private stream 2.

STD_buffer_size

This optional 13-bit binary number specifies
the size of the decoder input buffer. If
STD_buffer_scale = “0,” then STD_buffer_size
measures the size in units of 128 bytes. If
STD_buffer_scale = “1,” then STD_buffer_size
measures the size in units of 1024 bytes. This
field is present only if STD_bits is present and
stream_ID ≠ private stream 2.



PTS_bits

These optional 4 bits have a value of “0010”
and indicate that the following presentation time
stamp fields are present. This field may be present
only if stream_ID ≠ private stream 2.

Presentation_time_stamp_32-30

The optional presentation_time_stamp
(PTS) is a 33-bit number coded using three
fields, separated by marker bits. PTS indicates
the intended time of display by the decoder.
The value of PTS is the number of periods of a
90 kHz system clock. This field is present only
if PTS_bits is present and stream_ID ≠ private
stream 2.

Marker_bit

This optional bit always has a value of “1.”
It is present only if PTS_bits is present and
stream_ID ≠ private stream 2.

Presentation_time_stamp_29-15

This optional field is present only if
PTS_bits is present and stream_ID ≠ private
stream 2.

Marker_bit

This optional bit always has a value of “1.”
It is present only if PTS_bits is present and
stream_ID ≠ private stream 2.

Presentation_time_stamp_14-0

This optional field is present only if
PTS_bits is present and stream_ID ≠ private
stream 2.

Marker_bit

This optional 1-bit field always has a value
of “1.” It is present only if PTS_bits is present
and stream_ID ≠ private stream 2.



DTS_bits

These optional 4 bits have a value of “0011”
and indicate that the following presentation and
decoding time stamp fields are present. This field
may be present only if stream_ID ≠ private
stream 2.

Presentation_time_stamp_32-30

The optional presentation_time_stamp
(PTS) is a 33-bit number coded using three
fields, separated by marker bits. PTS indicates
the intended time of display by the decoder.
The value of PTS is the number of periods of a
90 kHz system clock. This field is present only
if DTS_bits is present and stream_ID ≠ private
stream 2.

Marker_bit

This optional 1-bit field always has a value
of “1.” It is present only if DTS_bits is present
and stream_ID ≠ private stream 2.

Presentation_time_stamp_29-15

This optional field is present only if
DTS_bits is present and stream_ID ≠ private
stream 2.

Marker_bit

This optional 1-bit field always has a value
of “1.” It is present only if DTS_bits is present
and stream_ID ≠ private stream 2.

Presentation_time_stamp_14-0

This optional field is present only if
DTS_bits is present and stream_ID ≠ private
stream 2.

Marker_bit

This optional 1-bit field always has a value
of “1.” It is present only if DTS_bits is present
and stream_ID ≠ private stream 2.

Fixed_bits

This optional 4-bit field has a value of
“0001.” It is present only if DTS_bits is present
and stream_ID ≠ private stream 2.

Decoding_time_stamp_32-30

The optional decoding_time_stamp (DTS)
is a 33-bit number coded using three fields,
separated by marker bits. DTS indicates the
intended time of decoding by the decoder of
the first access unit that commences in the
packet. The value of DTS is the number of periods
of a 90 kHz system clock. It is present only
if DTS_bits is present and stream_ID ≠ private
stream 2.

Marker_bit

This optional 1-bit field always has a value
of “1.” It is present only if DTS_bits is present
and stream_ID ≠ private stream 2.

Decoding_time_stamp_29-15

This optional field is present only if
DTS_bits is present and stream_ID ≠ private
stream 2.

Marker_bit

This optional 1-bit field always has a value
of “1.” It is present only if DTS_bits is present
and stream_ID ≠ private stream 2.







Decoding_time_stamp_14-0

This optional field is present only if
DTS_bits is present and stream_ID ≠ private
stream 2.

Marker_bit

This optional 1-bit field always has a value
of “1.” It is present only if DTS_bits is present
and stream_ID ≠ private stream 2.

NonPTS_nonDTS_bits

These optional 8 bits have a value of “0000
1111” and are present if the STD_bits field,
PTS_bits field, or DTS_bits field (and their
corresponding following fields) are not present.
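
The optional fields between packet_length and the packet data can be walked in order, as the following sketch shows. It assumes the caller has already confirmed that stream_ID ≠ private stream 2, takes the bytes that follow packet_length as input, and does not verify marker bits. The function names and the dictionary layout are illustrative.

```python
def read_timestamp(field: bytes) -> int:
    """Extract a 33-bit PTS or DTS from its 5-byte coded form."""
    value = int.from_bytes(field, "big")
    return ((((value >> 33) & 0x7) << 30)
            | (((value >> 17) & 0x7FFF) << 15)
            | ((value >> 1) & 0x7FFF))


def parse_packet_header(data: bytes):
    """Return (info, offset of first packet_data_byte) for one packet."""
    info = {"std": None, "pts": None, "dts": None}
    pos = 0
    while pos < len(data) and data[pos] == 0xFF:     # stuffing_byte
        pos += 1
    if data[pos] >> 6 == 0b01:                       # STD_bits
        word = int.from_bytes(data[pos:pos + 2], "big")
        scale = (word >> 13) & 1                     # STD_buffer_scale
        size = word & 0x1FFF                         # STD_buffer_size
        info["std"] = size * (1024 if scale else 128)
        pos += 2
    prefix = data[pos] >> 4
    if prefix == 0b0010:                             # PTS_bits: PTS only
        info["pts"] = read_timestamp(data[pos:pos + 5])
        pos += 5
    elif prefix == 0b0011:                           # DTS_bits: PTS then DTS
        if data[pos + 5] >> 4 != 0b0001:
            raise ValueError("fixed_bits before DTS are not 0001")
        info["pts"] = read_timestamp(data[pos:pos + 5])
        info["dts"] = read_timestamp(data[pos + 5:pos + 10])
        pos += 10
    elif data[pos] == 0x0F:                          # NonPTS_nonDTS_bits
        pos += 1
    else:
        raise ValueError("unexpected bits in packet header")
    return info, pos
```

Dividing a returned PTS or DTS by 90000 gives the time in seconds, since both count periods of the 90 kHz system clock.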



Packet_data_byte 

This is [n] bytes of data from the bitstream 
specified by the packet layer stream_ID. The 
number of data bytes may be determined from 
the packet_length parameter. 

Video Decoding 

A system demultiplexer parses the system 
bitstream, demultiplexing the audio and video 
bitstreams. 

The video decoder essentially performs 
the inverse of the encoder. From the coded 
video bitstream, it reconstructs the I frames. 
Using I frames, additional coded data, and 
motion vectors, the P and B frames are gener- 
ated. Finally, the frames are output in the 
proper order. 



Fast Playback Considerations 

Fast forward operation can be imple- 
mented by using D frames or the decoding 
only of I frames. However, decoding only I 
frames at the faster rate places a major burden 
on the transmission medium and the decoder. 

Alternately, the source may be able to sort 
out the desired I frames and transmit just 
those frames, allowing the bit-rate to remain 
constant. 

Pause Mode Considerations 

This requires the decoder to be able to
control the incoming bitstream. If it cannot,
there may be a delay and skipped frames
when playback resumes.

Reverse Playback Considerations 

This requires the decoder to be able to 
decode each group of pictures in the forward 
direction, store them, and display them in 
reverse order. To minimize the storage 
requirements of the decoder, groups of pic- 
tures should be small or the frames may be 
reordered. Reordering can be done by trans- 
mitting frames in another order or by reorder- 
ing the coded pictures in the decoder buffer. 

Decode Postprocessing 

The SIF data usually is converted to 720 x 
480i or 720 x 576i. Suggested upsampling fil- 
ters are discussed in the MPEG-1 specification. 
The original decoded lines correspond to Field 
1. Field 2 uses interpolated lines. 
Real-World Issues

System Bitstream Termination 

A common error is the improper placement
of sequence_end_code in the system bitstream.
When this happens, some decoders
may not know that the end of the video
occurred, and output garbage.

Another problem occurs when a system bit- 
stream is shortened just by eliminating trailing 
frames, removing sequence_end_code altogether. 
In this case, the decoder may be unsure when 
to stop. 

Timecodes 

Since some decoders rely on the timecode 
information, it should be implemented. To min- 
imize problems, the video bitstream should 
start with a timecode of zero and increment by 
one each frame. 

Variable Bit-Rates 

Although variable bit-rates are supported, 
a constant bit-rate should be used if possible. 
Since vbv_delay doesn’t make sense for a vari- 
able bit-rate, the MPEG-1 standard specifies 
that it be set to the maximum value. 

However, some decoders use vbv_delay 
with variable bit-rates. This could result in a 
2-3 second delay before starting video, causing 
the first 60-90 frames to be skipped. 

Constrained Bitstreams 

Most MPEG-1 decoders can handle only 
the constrained parameters subset of MPEG-1. 
To ensure maximum compatibility, only the 
constrained parameters subset should be used. 



Source Sample Clock 

Good compression with few artifacts 
requires a video source that generates or uses 
a very stable sample clock. This ensures that 
the vertical alignment of samples over the 
entire picture is maintained. With poorly 
designed sample clock generation, the arti- 
facts usually get worse towards the right side 
of the picture. 

MPEG-2 



MPEG-2 extends MPEG-1 to cover a wider 
range of applications. The MPEG-1 chapter 
should be reviewed to become familiar with 
the basics of MPEG before reading this chap- 
ter. 

The primary application targeted during 
the definition process was all-digital transmis- 
sion of broadcast-quality video at bit-rates of 4- 
9 Mbps. However, MPEG-2 is useful for many 
other applications, such as HDTV, and now 
supports bit-rates of 1.5-60 Mbps. 

MPEG-2 is an ISO standard (ISO/IEC 
13818), and consists of eleven parts: 



systems                   ISO/IEC 13818-1 
video                     ISO/IEC 13818-2 
audio                     ISO/IEC 13818-3 
conformance testing       ISO/IEC 13818-4 
software simulation       ISO/IEC 13818-5 
DSM-CC extensions         ISO/IEC 13818-6 
advanced audio coding     ISO/IEC 13818-7 
RTI extension             ISO/IEC 13818-9 
DSM-CC conformance        ISO/IEC 13818-10 
IPMP                      ISO/IEC 13818-11 



As with MPEG-1, the compressed bit- 
streams implicitly define the decompression 
algorithms. The compression algorithms are 
up to the individual manufacturers, within the 
scope of an international standard. 



The Digital Storage Media Command and 
Control (DSM-CC) extension (ISO/IEC 
13818-6) is a toolkit for developing control 
channels associated with MPEG-2 streams. In 
addition to providing VCR-type features such 
as fast-forward, rewind, pause, etc., it may be 
used for a wide variety of other purposes, such 
as packet data transport. DSM-CC works in 
conjunction with next-generation packet 
networks, working alongside Internet protocols 
such as RSVP, RTSP, RTP, and SCP. 

The Real Time Interface (RTI) extension 
(ISO/IEC 13818-9) defines a common inter- 
face point to which terminal equipment manu- 
facturers and network operators can design. 
RTI specifies a delivery model for the bytes of 
an MPEG-2 System stream at the input of a 
real decoder, whereas MPEG-2 System defines 
an idealized byte delivery schedule. 

IPMP (Intellectual Property Management 
and Protection) is a digital rights management 
(DRM) standard, adapted from the MPEG-4 
IPMP extension specification. Rather than a 
complete system, a variety of functions are pro- 
vided within a framework. 

Audio Overview 

In addition to the non-backwards-compati- 
ble audio extension (ISO/IEC 13818-7), 
MPEG-2 supports up to five full-bandwidth 
channels compatible with MPEG-1 audio cod- 
ing. It also extends the coding of MPEG-1 
audio to half sampling rates (16 kHz, 22.05 
kHz, and 24 kHz) for improved quality for bit- 
rates at or below 64 kbps per channel. 

MPEG-2.5 is an unofficial, yet common, 
extension to the audio capabilities of MPEG-2. 
It adds sampling rates of 8 kHz, 11.025 kHz, 
and 12 kHz. 



Video Overview 

With MPEG-2, profiles specify the syntax 
(i.e., algorithms) and levels specify various 
parameters (resolution, frame rate, bit-rate, 
etc.). Main Profile@Main Level is targeted for 
SDTV applications, while Main Profile@High 
Level is targeted for HDTV applications. 

Levels 

MPEG-2 supports four levels, which spec- 
ify resolution, frame rate, coded bit-rate, and so 
on for a given profile. 

Low Level (LL) 

MPEG-1 Constrained Parameters Bitstream 
(CPB), supporting up to 352 x 288 at up to 
30 frames per second. Maximum bit-rate is 
4 Mbps. 

Main Level (ML) 

MPEG-2 Constrained Parameters Bit- 
stream (CPB) supports up to 720 x 576 at up to 
30 frames per second and is intended for SDTV 
applications. Maximum bit-rate is 15-20 Mbps. 



High 1440 Level 

This level supports up to 1440 x 1088 at up 
to 60 frames per second and is intended for 
HDTV applications. Maximum bit-rate is 60-80 
Mbps. 

High Level (HL) 

High Level supports up to 1920 x 1088 at 
up to 60 frames per second and is intended for 
HDTV applications. Maximum bit-rate is 80- 
100 Mbps. 

Profiles 

MPEG-2 supports six profiles, which spec- 
ify which coding syntax (algorithms) is used. 
Tables 13.1 through 13.8 illustrate the various 
combinations of levels and profiles allowed. 

Simple Profile (SP) 

Main profile without the B frames, 
intended for software applications and perhaps 
digital cable TV. 

Main Profile (MP) 

Supported by most MPEG-2 decoder 
chips, it should satisfy 90% of the consumer 
SDTV and HDTV applications. Typical resolu- 
tions are shown in Table 13.6. 

Multiview Profile (MVP) 

By using existing MPEG-2 tools, it is possi- 
ble to encode video from two cameras shooting 
the same scene with a small angle difference. 

4:2:2 Profile (422P) 

Previously known as “studio profile,” this 
profile uses 4:2:2 YCbCr instead of 4:2:0 and, 
with main level, increases the maximum 
bit-rate up to 50 Mbps (300 Mbps with high 
level). It was added to support pro-video 
SDTV and HDTV requirements. 

                                 Profile 
             Nonscalable                           Scalable 
Level        Simple   Main   Multiview   4:2:2     SNR   Spatial   High 
high         -        yes    -           yes       -     -         yes 
high 1440    -        yes    -           -         -     yes       yes 
main         yes      yes    yes         yes       yes   -         yes 
low          -        yes    -           -         yes   -         - 



Table 13.1. MPEG-2 Acceptable Combinations of Levels and Profiles. 



                                                                   Profile 
                                     Nonscalable                                                Scalable 
Constraint                           Simple        Main          Multiview     4:2:2            SNR           Spatial          High 
chroma format                        4:2:0         4:2:0         4:2:0         4:2:0 or 4:2:2   4:2:0         4:2:0            4:2:0 or 4:2:2 
picture types                        I, P          I, P, B       I, P, B       I, P, B          I, P, B       I, P, B          I, P, B 
scalable modes                       -             -             Temporal      -                SNR           SNR or Spatial   SNR or Spatial 
intra dc precision (bits)            8, 9, 10      8, 9, 10      8, 9, 10      8, 9, 10, 11     8, 9, 10      8, 9, 10         8, 9, 10, 11 
sequence scalable extension          no            no            yes           no               yes           yes              yes 
picture spatial scalable extension   no            no            no            no               no            yes              yes 
picture temporal scalable extension  no            no            yes           no               no            no               no 
repeat first field                   constrained   constrained   constrained   unconstrained    constrained   constrained      unconstrained 



Table 13.2. Some MPEG-2 Profile Constraints. 

                                                          Profile 
Level       Maximum Number of Layers          SNR   Spatial   High   Multiview 
high        All layers (base + enhancement)   -     -         3      2 
            Spatial enhancement layers        -     -         1      0 
            SNR enhancement layers            -     -         1      0 
            Temporal auxiliary layers         -     -         0      1 
high 1440   All layers (base + enhancement)   -     3         3      2 
            Spatial enhancement layers        -     1         1      0 
            SNR enhancement layers            -     1         1      0 
            Temporal auxiliary layers         -     0         0      1 
main        All layers (base + enhancement)   2     -         3      2 
            Spatial enhancement layers        0     -         1      0 
            SNR enhancement layers            1     -         1      0 
            Temporal auxiliary layers         0     -         0      1 
low         All layers (base + enhancement)   2     -         -      2 
            Spatial enhancement layers        0     -         -      0 
            SNR enhancement layers            1     -         -      0 
            Temporal auxiliary layers         0     -         -      1 



Table 13.3. MPEG-2 Number of Permissible Layers for Scalable Profiles. 



                                                                             Profile at Level 
Profile     Base Layer       Enhancement Layer 1   Enhancement Layer 2      for Base Decoder 
SNR         4:2:0            SNR, 4:2:0            -                        MP@same level 
spatial     4:2:0            SNR, 4:2:0            -                        MP@same level 
            4:2:0            Spatial, 4:2:0        -                        MP@(level-1) 
            4:2:0            SNR, 4:2:0            Spatial, 4:2:0 
            4:2:0            Spatial, 4:2:0        SNR, 4:2:0 
high        4:2:0 or 4:2:2   -                     -                        HP@same level 
            4:2:0            SNR, 4:2:0            - 
            4:2:0 or 4:2:2   SNR, 4:2:2            - 
            4:2:0            Spatial, 4:2:0        -                        HP@(level-1) 
            4:2:0 or 4:2:2   Spatial, 4:2:2        - 
            4:2:0            SNR, 4:2:0            Spatial, 4:2:0 or 4:2:2 
            4:2:0 or 4:2:2   SNR, 4:2:2            Spatial, 4:2:2 
            4:2:0            Spatial, 4:2:0        SNR, 4:2:0 or 4:2:2 
            4:2:0            Spatial, 4:2:2        SNR, 4:2:2 
            4:2:2            Spatial, 4:2:2        SNR, 4:2:2 
multiview   4:2:0            Temporal, 4:2:0       -                        MP@same level 



Table 13.4. Some MPEG-2 Video Decoder Requirements for Various Profiles. 

            Spatial 
            Resolution                                             Profile 
Level       Layer         Parameter           Simple   Main   Multiview   4:2:2   SNR/Spatial   High 
high        Enhancement   Samples per line    -        1920   1920        1920    -             1920 
                          Lines per frame     -        1088   1088        1088    -             1088 
                          Frames per second   -        60     60          60      -             60 
            Lower         Samples per line    -        -      1920        -       -             960 
                          Lines per frame     -        -      1088        -       -             576 
                          Frames per second   -        -      60          -       -             30 
high 1440   Enhancement   Samples per line    -        1440   1440        -       1440          1440 
                          Lines per frame     -        1088   1088        -       1088          1088 
                          Frames per second   -        60     60          -       60            60 
            Lower         Samples per line    -        -      1440        -       720           720 
                          Lines per frame     -        -      1088        -       576           576 
                          Frames per second   -        -      60          -       30            30 
main        Enhancement   Samples per line    720      720    720         720     720           720 
                          Lines per frame     576      576    576         608     576           576 
                          Frames per second   30       30     30          30      30            30 
            Lower         Samples per line    -        -      720         -       -             352 
                          Lines per frame     -        -      576         -       -             288 
                          Frames per second   -        -      30          -       -             30 
low         Enhancement   Samples per line    -        352    352         -       352           - 
                          Lines per frame     -        288    288         -       288           - 
                          Frames per second   -        30     30          -       30            - 
            Lower         Samples per line    -        -      352         -       -             - 
                          Lines per frame     -        -      288         -       -             - 
                          Frames per second   -        -      30          -       -             - 



Note : 

1. The above levels and profiles that originally specified 1152 maximum lines per frame were 
changed to 1088 lines per frame. 



Table 13.5. MPEG-2 Upper Limits of Resolution and Temporal Parameters. In the case of 
single layer or SNR scalability coding, the “Enhancement Layer” parameters apply. 

Level | Maximum Bit-Rate (Mbps) | Typical Active Resolutions | Frame Rate (Hz), see note 2: 
23.976p  24p  25p  29.97p  30p  50p  59.94p  60p  25i  29.97i  30i 

high 


80 

(100 for High Profile) 
(300 for 4:2:2 Profile) 


1920 x 1080 1 


X 


X 


X 


X 


X 








X 


X 


X 


high 

1440 


60 

(80 for High Profile) 


1280 x 720 


X 


X 


X 


X 


X 


X 


X 


X 








960 x 1080 1 


X 


X 


X 


X 


X 








X 


X 


X 


1280 x 1080 1 


X 


X 


X 


X 


X 








X 


X 


X 


1440 x 1080 1 


X 


X 


X 


X 


X 








X 


X 


X 


main 


15 

(20 for High Profile) 
(50 for 4:2:2 Profile) 


352 x 480 


X 


X 




X 


X 




X 


X 




X 


X 


352 x 576 




X 


X 






X 






X 






480 x 480 


X 


X 




X 


X 




X 


X 




X 


X 


544 x 480 


X 


X 




X 


X 




X 


X 




X 


X 


544 x 576 




X 


X 






X 






X 






640 x 480 


X 


X 




X 


X 




X 


X 




X 


X 


704 x 480, 720 x 480 


X 


X 




X 


X 




X 


X 




X 


X 


704 x 576, 720 x 576 




X 


X 






X 






X 






low 


4 


320 x 240 


X 


X 




X 


X 




X 


X 




X 


X 


352 x 240 


X 


X 




X 


X 




X 


X 




X 


X 


352 x 288 




X 


X 






X 






X 







Notes : 

1. The video coding system requires that the number of active scan lines be a multiple of 32 for 
interlaced pictures, and a multiple of 16 for progressive pictures. Thus, for the 1080-line inter- 
laced format, the video encoder and decoder must actually use 1088 lines. The extra eight 
lines are “dummy" lines having no content, and designers choose dummy data that simplifies 
the implementation. The extra eight lines are always the last eight lines of the encoded 
image. These dummy lines do not carry useful information, but add little to the data required 
for transmission. 

2. p = progressive; i = interlaced. 



Table 13.6. Example Levels and Resolutions for MPEG-2 Main Profile. 
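
The rounding rule in note 1 can be illustrated with a short calculation. This is only a sketch: the function name coded_lines is hypothetical, and it assumes the coded line count is simply the active line count rounded up to the next multiple of 32 (interlaced) or 16 (progressive), as the note describes.

```python
def coded_lines(active_lines: int, interlaced: bool) -> int:
    """Round the active line count up to the multiple required by note 1."""
    multiple = 32 if interlaced else 16
    return ((active_lines + multiple - 1) // multiple) * multiple


# 1080 interlaced lines need eight extra dummy lines; the others already fit.
for lines, interlaced in [(1080, True), (720, False), (480, True), (576, True)]:
    scan = "interlaced" if interlaced else "progressive"
    print(lines, scan, "->", coded_lines(lines, interlaced))
```

For the 1080-line interlaced format this gives 1088, which is the value used in the level limits of Table 13.5.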

            Spatial 
            Resolution                                        Profile 
Level       Layer         Simple      Main        Multiview   SNR/Spatial   High                4:2:2 
high        Enhancement   -           62.668800   62.668800   -             62.668800 (4:2:2)   62.668800 
                                                                            83.558400 (4:2:0) 
            Lower         -           -           62.668800   -             14.745600 (4:2:2)   - 
                                                                            19.660800 (4:2:0) 
high 1440   Enhancement   -           47.001600   47.001600   47.001600     47.001600 (4:2:2)   - 
                                                                            62.668800 (4:2:0) 
            Lower         -           -           47.001600   10.368000     11.059200 (4:2:2)   - 
                                                                            14.745600 (4:2:0) 
main        Enhancement   10.368000   10.368000   10.368000   10.368000     11.059200 (4:2:2)   11.059200 
                                                                            14.745600 (4:2:0) 
            Lower         -           -           10.368000   -             3.041280 (4:2:0)    - 
low         Enhancement   -           3.041280    3.041280    3.041280      -                   - 
            Lower         -           -           3.041280    -             -                   - 



Table 13.7. MPEG-2 Upper Limits for Y Sample Rate (Msamples/second). In the case of 
single layer or SNR scalability coding, the “Enhancement Layer” parameters apply. 
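
As a rough cross-check, the Main Profile entries in Table 13.7 can be reproduced as samples per line x lines per frame x frames per second. The sketch below is illustrative only: the frame rates used (30, 30, 25, and 30 Hz) are assumptions chosen because their products match the tabulated limits; they are not stated in the table itself.

```python
def luma_rate(samples_per_line: int, lines_per_frame: int, frames_per_second: int) -> int:
    """Y samples per second for one picture format."""
    return samples_per_line * lines_per_frame * frames_per_second


# (samples per line, lines per frame, assumed frame rate) for each level
examples = {
    "high": (1920, 1088, 30),
    "high 1440": (1440, 1088, 30),
    "main": (720, 576, 25),
    "low": (352, 288, 30),
}

for level, dims in examples.items():
    print(f"{level}: {luma_rate(*dims) / 1e6:.6f} Msamples/second")
```

This suggests the Y sample rate limit is what prevents the maximum resolution and the maximum frame rate of Table 13.5 from being used at the same time.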



                                                  Profile 
            Nonscalable                                      Scalable 
Level       Simple   Main   Multiview           4:2:2   SNR/Spatial                 High 
high        -        80     130 (both layers)   300     -                           100 (all layers) 
                            80 (base layer)                                         80 (middle + base layers) 
                                                                                    25 (base layer) 
high 1440   -        60     100 (both layers)   -       60 (all layers)             80 (all layers) 
                            60 (base layer)             40 (middle + base layers)   60 (middle + base layers) 
                                                        15 (base layer)             20 (base layer) 
main        15       15     25 (both layers)    50      15 (both layers)            20 (all layers) 
                            5 (base layer)              10 (base layer)             15 (middle + base layers) 
                                                                                    4 (base layer) 
low         -        4      8 (both layers)     -       4 (both layers)             - 
                            4 (base layer)              3 (base layer) 



Table 13.8. MPEG-2 Upper Limits for Bit-Rates (Mbps). 

SNR and Spatial Profiles 

Adds support for SNR scalability and/or 
spatial scalability. 

High Profile (HP) 

Targeted for pro-video HDTV applications. 

Scalability 

The MPEG-2 SNR, Spatial, and High profiles 
support four scalable modes of operation. 
These modes break MPEG-2 video into layers 
for the purpose of prioritizing video data. 
Scalability is not commonly used since 
efficiency decreases by about 2 dB (or about 
30% more bits are required). 

SNR Scalability 

This mode is targeted for applications that 
desire multiple quality levels. All layers have 
the same spatial resolution. The base layer pro- 
vides the basic video quality. The enhancement 
layer increases the video quality by providing 
refinement data for the DCT coefficients of the 
base layer. 

Spatial Scalability 

Useful for simulcasting, each layer has a 
different spatial resolution. The base layer pro- 
vides the basic spatial resolution and temporal 
rate. The enhancement layer uses the spatially 
interpolated base layer to increase the spatial 
resolution. For example, the base layer may 
implement 352 x 240 resolution video, with the 
enhancement layers used to generate 704 x 
480 resolution video. 



Temporal Scalability 

This mode allows migration from low tem- 
poral rate to higher temporal rate systems. The 
base layer provides the basic temporal rate. 
The enhancement layer uses temporal predic- 
tion relative to the base layer. The base and 
enhancement layers can be combined to pro- 
duce a full temporal rate output. All layers have 
the same spatial resolution and chroma for- 
mats. In case of errors in the enhancement lay- 
ers, the base layer can be used for 
concealment. 

Data Partitioning 

This mode is targeted for cell loss resil- 
ience in ATM networks. It breaks the 64 quan- 
tized transform coefficients into two 
bitstreams. The higher priority bitstream con- 
tains critical lower-frequency DCT coefficients 
and side information such as headers and 
motion vectors. A lower-priority bitstream car- 
ries higher-frequency DCT coefficients that 
add detail. 
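
The split can be sketched as follows. This is a simplified illustration, not the bitstream syntax: it assumes the 64 quantized coefficients of a block are already in scan order (lowest frequency first), and the name breakpoint is a hypothetical parameter marking where the high-priority partition ends. Headers and motion vectors, which also travel in the high-priority bitstream, are not modeled.

```python
def partition_block(scanned_coeffs, breakpoint):
    """Split one block's scan-ordered coefficients into two priority partitions."""
    if len(scanned_coeffs) != 64:
        raise ValueError("expected 64 coefficients per 8 x 8 block")
    if not 1 <= breakpoint <= 64:
        raise ValueError("breakpoint must be between 1 and 64")
    high_priority = scanned_coeffs[:breakpoint]   # DC and low-frequency terms
    low_priority = scanned_coeffs[breakpoint:]    # higher-frequency detail
    return high_priority, low_priority


block = list(range(64))            # stand-in values, index = scan position
high, low = partition_block(block, 10)
print(len(high), len(low))
```

If the low-priority partition is lost, a decoder can still rebuild a coarse picture from the high-priority coefficients alone.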

Transport and Program Streams 

The MPEG-2 Systems Standard specifies 
two methods for multiplexing the audio, video, 
and other data into a format suitable for trans- 
mission and storage. 

The program stream is designed for appli- 
cations where errors are unlikely. It contains 
audio, video, and data bitstreams (also called 
elementary bitstreams) all merged into a single 
bitstream. The program stream, as well as 
each of the elementary bitstreams, may be a 
fixed or variable bit-rate. DVDs and SVCDs use 
program streams, carrying the DVD- and 
SVCD-specific data in private data streams 
interleaved with the video and audio streams. 

The transport stream, using fixed-size 
packets of 188 bytes, is designed for 
applications where data loss is likely. It also 
contains audio, video, and data bitstreams all 
merged into a single bitstream, and can carry 
multiple programs. The ARIB, ATSC, DVB, and 
OpenCable standards use transport streams. 
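
Because every transport packet is exactly 188 bytes, a receiver can step through a stream at fixed offsets. The sketch below assumes two details of the ISO/IEC 13818-1 packet header that are not covered in the text above: each packet starts with the sync byte 0x47, and a 13-bit packet identifier (PID) occupies the low five bits of the second byte and all of the third. The helper make_packet is hypothetical and only builds test data.

```python
SYNC_BYTE = 0x47
PACKET_SIZE = 188


def iter_packets(data: bytes):
    """Yield successive 188-byte transport packets, checking the sync byte."""
    for offset in range(0, len(data) - PACKET_SIZE + 1, PACKET_SIZE):
        packet = data[offset:offset + PACKET_SIZE]
        if packet[0] != SYNC_BYTE:
            raise ValueError(f"lost sync at byte offset {offset}")
        yield packet


def packet_pid(packet: bytes) -> int:
    """Extract the 13-bit PID from bytes 1 and 2 of the packet header."""
    return ((packet[1] & 0x1F) << 8) | packet[2]


def make_packet(pid: int) -> bytes:
    """Hypothetical helper: a packet with the given PID and an all-zero payload."""
    header = bytes([SYNC_BYTE, (pid >> 8) & 0x1F, pid & 0xFF, 0x10])
    return header + bytes(PACKET_SIZE - len(header))


stream = make_packet(0x100) + make_packet(0x101) + make_packet(0x100)
print([hex(packet_pid(p)) for p in iter_packets(stream)])
```

Grouping packets by PID is how a demultiplexer separates the elementary bitstreams of the several programs carried in one transport stream.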

Both the transport stream and program 
stream are based on a common packet struc- 
ture, facilitating common decoder implementa- 
tions and conversions. Both streams are 
designed to support a large number of known 
and anticipated applications, while retaining 
flexibility. 



Video Coding Layer 

YCbCr Color Space 

MPEG-2 uses the YCbCr color space, sup- 
porting 4:2:0, 4:2:2, and 4:4:4 sampling. The 
4:2:2 and 4:4:4 sampling options increase the 
chroma resolution over 4:2:0, resulting in bet- 
ter picture quality. 

The 4:2:0 sampling structure for MPEG-2 
is shown in Figures 3.8 through 3.10. The 4:2:2 
and 4:4:4 sampling structures are shown in 
Figures 3.2 and 3.3. 

Coded Picture Types 

There are three types of coded pictures. I 
(intra) pictures are fields or frames coded as a 
stand-alone still image. They allow random 
access points within the video stream. As such, 
I pictures should occur about two times a sec- 
ond. I pictures also should be used where 
scene cuts occur. 

P (predicted) pictures are fields or frames 
coded relative to the nearest previous I or P 
picture, resulting in forward prediction 
processing, as shown in Figure 13.1. P pictures 
provide more compression than I pictures, 
through the use of motion compensation, and 
are also a reference for B pictures and future P 
pictures. 

B (bi-directional) pictures are fields or 
frames that use the closest past and future I or 
P picture as a reference, resulting in bi-direc- 
tional prediction, as shown in Figure 13.1. B 
pictures provide the most compression, and 
decrease noise by averaging two pictures. Typ- 
ically, there are two B pictures separating I or 
P pictures. 

D (DC) pictures are not supported in 
MPEG-2, except for decoding to support back- 
wards compatibility with MPEG-1. 

A group of pictures (GOP) is a series of 
one or more coded pictures intended to assist 
in random accessing and editing. The GOP 
value is configurable during the encoding pro- 
cess. The smaller the GOP value, the better 
the response to movement (since the I pictures 
are closer together), but the lower the com- 
pression. 

In the coded bitstream, a GOP must start 
with an I picture and may be followed by any 
number of I, P, or B pictures in any order. In 
display order, a GOP must start with an I or B 
picture and end with an I or P picture. Thus, 
the smallest GOP size is a single I picture, with 
the largest size unlimited. 

Each GOP should be coded independently 
of any other GOP. However, this is not true 
unless no B pictures precede the first I picture, 
or if they do, they use only backward motion 
compensation. This results in both open and 
closed GOP formats. A closed GOP is a GOP 
that can be decoded without using pictures of 
the previous GOP for motion compensation. 
For an open GOP, the broken_link flag 
indicates that the first B pictures (if any) 
immediately following the first I picture after 
the GOP header may not be decoded correctly 
(and thus not be displayed) since the reference 
picture used for prediction is not available due 
to editing. 

Figure 13.1. MPEG-2 I, P, and B Pictures. Some pictures can be transmitted out of sequence, 
complicating the interpolation process and requiring picture reordering by the MPEG decoder. 
Arrows show inter-frame dependencies. 
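
The difference between display order and coded (bitstream) order described above can be sketched in a few lines. This is a simplified model, assuming each I or P picture is sent ahead of the B pictures that precede it in display order; the function name display_to_coded is hypothetical, and real encoders also deal with fields, headers, and buffering.

```python
def display_to_coded(display):
    """Reorder picture types from display order to coded order.

    Each B picture needs the next I or P picture as a reference, so that
    reference is emitted first, followed by the B pictures waiting on it.
    Entries are (display_index, picture_type) pairs.
    """
    coded, waiting_b = [], []
    for index, ptype in enumerate(display):
        if ptype == "B":
            waiting_b.append((index, ptype))
        else:                      # I or P: a reference picture
            coded.append((index, ptype))
            coded.extend(waiting_b)
            waiting_b = []
    coded.extend(waiting_b)        # only non-empty if display ends with B
    return coded


print(display_to_coded(list("IBBPBBP")))
print(display_to_coded(list("BBIBBP")))   # display order starting with B
```

In the second example the coded order begins with the I picture even though display begins with two B pictures, matching the rule that a GOP starts with an I picture in the bitstream but may start with an I or B picture in display order.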

Motion Compensation 

Motion compensation for MPEG-2 is more 
complex due to the introduction of fields. After 
a macroblock has been compressed using 
motion compensation, it contains both the spa- 
tial difference (motion vectors) and content dif- 
ference (error terms) between the reference 
macroblock and macroblock being coded. 

The two major classifications of prediction 
are field and frame. Within field pictures, only 
field predictions are used. Within frame 
pictures, either field or frame predictions can 
be used (selectable at the macroblock level). 

Motion vectors for MPEG-2 are always 
coded in half-pixel units. MPEG-1 supports 
either half-pixel or full-pixel units. 

16 x 8 Motion Compensation Option 

Two motion vectors (four for B pictures) 
per macroblock are used, one for the upper 16 
x 8 region of a macroblock and one for the 
lower 16 x 8 region of a macroblock. It is only 
used with field pictures. 

Dual-Prime Motion Compensation Option 

This is only used with P pictures that have 
no B pictures between the predicted and 
reference fields or frames. One motion vector is 
used, together with a small differential motion 
vector. All of the necessary predictions are 
derived from these. 

Macroblocks 

Three types of macroblocks are available 
in MPEG-2. 

The 4:2:0 macroblock (Figure 13.2) con- 
sists of four Y blocks, one Cb block, and one Cr 
block. The block ordering is shown in the fig- 
ure. 

The 4:2:2 macroblock (Figure 13.3) con- 
sists of four Y blocks, two Cb blocks, and two 
Cr blocks. The block ordering is shown in the 
figure. 

The 4:4:4 macroblock (Figure 13.4) con- 
sists of four Y blocks, four Cb blocks, and four 
Cr blocks. The block ordering is shown in the 
figure. 

Macroblocks in P pictures are coded using 
the closest previous I or P picture as a refer- 
ence, resulting in two possible codings: 

• intra-coding 

no motion compensation 

• forward prediction 

closest previous I or P picture is the 
reference 

Macroblocks in B pictures are coded using 
the closest previous and/or future I or P pic- 
ture as a reference, resulting in four possible 
codings: 

• intra-coding 

no motion compensation 

• forward prediction 

closest previous I or P picture is the 
reference 

• backward prediction 

closest future I or P picture is the 
reference 



• bi-directional prediction 

two pictures used as the reference: 
the closest previous I or P picture and 
the closest future I or P picture 



I Pictures 

Macroblocks 

There are ten types of macroblocks in I 
pictures, as shown in Table 13.27. 

If the [macroblock quant] column in Table 
13.27 has a “1,” the quantizer scale is transmit- 
ted. For the remaining macroblock types, the 
DCT correction is coded using the previous 
value for quantizer scale. 

If the [coded pattern] column in Table 
13.27 has a “1,” the 6-bit coded block pattern is 
transmitted as a variable-length code. This tells 
the decoder which of the six blocks in the 4:2:0 
macroblock are coded (“1”) and which are not 
coded (“0”). Table 13.32 lists the codewords 
assigned to the 63 possible combinations. 
There is no code for when none of the blocks is 
coded; it is indicated by the macroblock type. 
For 4:2:2 and 4:4:4 macroblocks, an additional 
two or six bits, respectively, are used to extend 
the coded block pattern. 
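
Once the variable-length code has been mapped back to the 6-bit coded block pattern value, finding the coded blocks is a matter of testing bits. The sketch below covers only the 4:2:0 case and assumes the most significant bit corresponds to block 0 (the first Y block) and the least significant bit to block 5 (Cr); this bit ordering is taken from the usual reading of the MPEG-2 video syntax rather than from the text above, and the function name is hypothetical.

```python
def coded_blocks_420(cbp: int):
    """Return the indices (0-5) of the coded blocks in a 4:2:0 macroblock."""
    if not 1 <= cbp <= 63:
        raise ValueError("a transmitted pattern is one of the 63 non-zero values")
    # Bit 5 (value 32) is assumed to flag block 0, bit 0 (value 1) block 5.
    return [block for block in range(6) if cbp & (1 << (5 - block))]


print(coded_blocks_420(63))   # all six blocks coded
print(coded_blocks_420(60))   # only the four Y blocks
print(coded_blocks_420(3))    # only Cb and Cr
```

The value 0 is rejected because, as noted above, the case where no block is coded is signaled by the macroblock type instead.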

DCT 

Each 8x8 block (of input samples or pre- 
diction error terms) is processed by an 8 x 8 
DCT (discrete cosine transform), resulting in 
an 8 x 8 block of horizontal and vertical fre- 
quency coefficients, as shown in Figure 7.56. 

Input sample values are 0-255, resulting in 
a range of 0-2040 for the DC coefficient and a 
range of about -2048 to 2047 for the AC coeffi- 
cients. 
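
The 0-2040 range for the DC coefficient can be checked numerically. The sketch below assumes the conventional 8 x 8 DCT normalization, F(u,v) = (1/4) C(u) C(v) times the double sum of f(x,y) cos((2x+1)u pi/16) cos((2y+1)v pi/16) with C(0) = 1/sqrt(2); under that scaling the DC term is eight times the block average. It is a direct, unoptimized evaluation written for clarity, not a practical implementation.

```python
import math

N = 8


def dct_8x8(block):
    """Direct 8 x 8 forward DCT with the conventional (assumed) scaling."""
    def c(k):
        return 1.0 / math.sqrt(2.0) if k == 0 else 1.0

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            coeffs[u][v] = 0.25 * c(u) * c(v) * total
    return coeffs


white = [[255] * N for _ in range(N)]      # brightest possible flat block
coeffs = dct_8x8(white)
print(round(coeffs[0][0], 6))              # DC term: 8 * 255
print(max(abs(coeffs[u][v]) for u in range(N) for v in range(N) if (u, v) != (0, 0)))
```

A flat block of value 255 yields a DC coefficient of 2040 and AC coefficients that are zero to within rounding error, which is the upper end of the DC range quoted above.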

Due to spatial and SNR scalability, non-intra 
blocks (blocks within a non-intra macroblock) 
are also possible. Non-intra block coefficients 
represent differences between sample 
values rather than actual sample values. They 
are obtained by subtracting the 
motion-compensated values from the previous 
picture from the values in the current macroblock. 

Figure 13.2. MPEG-2 4:2:0 Macroblock Structure. 

Figure 13.3. MPEG-2 4:2:2 Macroblock Structure. 

Figure 13.4. MPEG-2 4:4:4 Macroblock Structure. 

Quantizing 

The 8x8 block of frequency coefficients is 
uniformly quantized, limiting the number of 
allowed values. The quantizer step scale is 
derived from the quantization matrix and quan- 
tizer scale and may be different for different 
coefficients and may change between macro- 
blocks. 

Since the eye is sensitive to large luma 
areas, the quantizer step size of the DC coeffi- 
cient is selectable to 8, 9, 10, or 11 bits of preci- 
sion. The quantized DC coefficient is deter- 
mined by dividing the DC coefficient by 8, 4, 2, 
or 1 and rounding to the nearest integer. 
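
The DC quantization rule above amounts to a lookup and a rounded division. The following sketch assumes a non-negative intra DC coefficient (0-2040) and rounds halves upward; the function name is hypothetical.

```python
# Divisor for each selectable intra DC precision (bits), as described above.
DC_DIVISOR = {8: 8, 9: 4, 10: 2, 11: 1}


def quantize_intra_dc(dc: int, precision_bits: int) -> int:
    """Divide the DC coefficient by 8, 4, 2, or 1 and round to nearest."""
    divisor = DC_DIVISOR[precision_bits]
    return (dc + divisor // 2) // divisor


for bits in (8, 9, 10, 11):
    print(bits, "bits:", quantize_intra_dc(1020, bits))
```

Each extra bit of precision halves the divisor, so the quantized value keeps one more bit of the original DC coefficient.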

AC coefficients are quantized using two 
quantization matrices: one for intra macro- 
blocks and one for non-intra macroblocks. 
When using 4:2:2 or 4:4:4 data, different matri- 
ces may be used for Y and CbCr data. Each 
quantization matrix has a default set of values 
that may be overwritten. 

If the [macroblock quant] column in Table 
13.27 has a “1,” the quantizer scale is transmit- 
ted. For the remaining macroblock types, the 
DCT correction is coded using the previous 
value for quantizer scale. 

Zig-Zag Scan 

Zig-zag scanning, starting with the DC 
component, generates a linear stream of quan- 
tized frequency coefficients arranged in order 
of increasing frequency, as shown in Figures 
7.59 and 7.60. This produces long runs of zero 
coefficients. 
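
The scan order can be generated rather than tabulated. The sketch below builds the conventional zig-zag pattern by walking the anti-diagonals of the block in alternating directions; it is assumed to correspond to the first of the two scans referenced above, and the alternate scan is not covered.

```python
def zigzag_order(n: int = 8):
    """Return (row, column) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):                 # s = row + column of a diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)              # even diagonals run bottom-left to top-right
        order.extend((r, s - r) for r in rows)
    return order


def zigzag_scan(block):
    """Flatten an 8 x 8 block into a list of 64 values in zig-zag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


order = zigzag_order()
print(order[:6])
print(len(order), order[-1])
```

Position 0 is the DC term at (0, 0), and later positions move toward higher horizontal and vertical frequencies, which is why the quantized zeros tend to collect at the end of the list.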

Coding of Quantized DC Coefficients 

After the DC coefficients have been quan- 
tized, they are losslessly coded. 



Coding of Y blocks within a macroblock 
follows the order shown in Figures 13.2 through 
13.4. The DC value of block 4 is the DC 
predictor for block 1 of the next macroblock. At 
the beginning of each slice, whenever a 
macroblock is skipped, or whenever a non-intra 
macroblock is decoded, the DC predictor is set to 
128 (if 8 bits of DC precision), 256 (if 9 bits of 
DC precision), 512 (if 10 bits of DC precision), 
or 1024 (if 11 bits of DC precision). 

The DC values of each Cb and Cr block are 
coded using the DC value of the corresponding 
block of the previous macroblock as a 
predictor. At the beginning of each slice, 
whenever a macroblock is skipped, or whenever 
a non-intra block is decoded, the DC 
predictors are set to 128 (8 bits of DC precision), 
256 (9 bits of DC precision), 512 (10 bits of DC 
precision), or 1024 (11 bits of DC precision). 
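
The reset values listed above are the midpoints of each precision range, 2^(precision - 1). The sketch below shows the reset value and a simplified differential calculation for a run of quantized DC values of one component within a single slice; it ignores skipped and non-intra macroblocks, which would reset the predictor, and both function names are hypothetical.

```python
def dc_reset_value(precision_bits: int) -> int:
    """Predictor value used after a reset: 128, 256, 512, or 1024."""
    return 1 << (precision_bits - 1)


def dc_differentials(quantized_dc_values, precision_bits: int):
    """Difference of each DC value from its predictor (the previous DC value)."""
    predictor = dc_reset_value(precision_bits)   # start of slice
    differences = []
    for value in quantized_dc_values:
        differences.append(value - predictor)
        predictor = value
    return differences


print([dc_reset_value(bits) for bits in (8, 9, 10, 11)])
print(dc_differentials([130, 131, 128], 8))
```

Because neighboring blocks usually have similar brightness, the differences stay small and code into few bits.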

However, a common implementation is to 
reset the DC predictors to zero and center the 
intra-block DC terms about zero instead of the 
50% grey level. Decoders then only have to 
handle the different intra DC precisions in the 
quantizer (which already has a multiplier that 
can be used to reconstruct the right value) 
instead of the parser (which generally doesn’t 
touch that data and has no multiplier) . 

Coding of Quantized AC Coefficients 

After the AC coefficients have been quan- 
tized, they are scanned in the order shown in 
Figure 7.59 or 7.60 and coded using run-length 
and level. The scan starts in position 1, as 
shown in Figures 7.59 and 7.60, as the DC coef- 
ficient in position 0 is coded separately. 

The run-lengths and levels are coded as 
shown in Tables 13.36 and 13.37. The “s” bit 
denotes the sign of the level; “0” is positive and 
“1” is negative. For intra blocks, either Table 
13.36 or Table 13.37 may be used, as specified 
by intra_vlc_format in the bitstream. For 
non-intra blocks, only Table 13.36 is used. 

For run-level combinations not shown in 
Tables 13.36 and 13.37, an escape sequence is 
used, consisting of the escape code (ESC), 
followed by the run-length and level codes from 
Tables 13.38 and 13.39. 

After the last DCT coefficient has been 
coded, an EOB code is added to tell the 
decoder that there are no more quantized coef- 
ficients in this 8x8 block. 
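
The run-length and level representation can be sketched before any variable-length codes are assigned. In this illustration the input is the list of quantized AC coefficients in scan order (positions 1 through 63), each non-zero coefficient becomes a (run, level) pair where run counts the zeros preceding it, and the string "EOB" stands in for the end-of-block code; the actual codewords of the referenced tables are not reproduced.

```python
def run_level_pairs(ac_coeffs):
    """Convert scan-ordered AC coefficients to (run, level) pairs plus EOB."""
    symbols = []
    run = 0
    for level in ac_coeffs:
        if level == 0:
            run += 1                       # extend the current run of zeros
        else:
            symbols.append((run, level))   # zeros skipped, then this value
            run = 0
    symbols.append("EOB")                  # trailing zeros are implied by EOB
    return symbols


ac = [5, 0, 0, -2, 1] + [0] * 58           # 63 AC coefficients
print(run_level_pairs(ac))
```

The long tail of zeros produced by quantization and zig-zag scanning costs nothing beyond the single EOB symbol.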

P Pictures 

Macroblocks 

There are 26 types of macroblocks in P pic- 
tures, as shown in Table 13.28, due to the addi- 
tional complexity of motion compensation. 

Skipped macroblocks are present when 
the macroblock_address_increment parameter 
in the bitstream is greater than 1. For P field 
pictures, the decoder predicts from the field of 
the same parity as the field being predicted, 
motion vector predictors are set to 0, and the 
motion vector is set to 0. For P frame pictures, 
the decoder sets the motion vector predictors 
to 0, and the motion vector is set to 0. 

If the [macroblock quant] column in Table 
13.28 has a “1,” the quantizer scale is 
transmitted. For the remaining macroblock types, 
the DCT correction is coded using the previous 
value for quantizer scale. 

If the [motion forward] column in Table 
13.28 has a “1,” horizontal and vertical forward 
motion vectors are successively transmitted. 

If the [coded pattern] column in Table 
13.28 has a “1,” the 6-bit coded block pattern is 
transmitted as a variable-length code. This tells 
the decoder which of the six blocks in the 
macroblock are coded (“1”) and which are not 
coded (“0”). Table 13.32 lists the codewords 
assigned to the 63 possible combinations. 
There is no code for when none of the blocks is 
coded; it is indicated by the macroblock type. 
For intra-coded macroblocks in P and B 
pictures, the coded block pattern is not 
transmitted, but is assumed to be a value of 63 
(all blocks are coded). For 4:2:2 and 4:4:4 
macroblocks, an additional two or six bits, 
respectively, are used to extend the coded block 
pattern. 

DCT 

Intra block AC coefficients are trans- 
formed in the same manner as they are for I 
pictures. Intra block DC coefficients are trans- 
formed differently; the predicted values are set 
to 1024, unless the previous block was intra- 
coded. 

Non-intra block coefficients represent dif- 
ferences between sample values rather than 
actual sample values. They are obtained by 
subtracting the motion-compensated values of 
the previous picture from the values in the cur- 
rent macroblock. There is no prediction of the 
DC value. 

Input sample values are -255 to +255, 
resulting in a range of about -2000 to +2000 for 
the AC coefficients. 

Quantizing 

Intra blocks are quantized in the same 
manner as they are for I pictures. 

Non-intra blocks are quantized using the 
quantizer scale and the non-intra quantization 
matrix. The AC and DC coefficients are quan- 
tized in the same manner. 

Coding of Intra Blocks 

Intra blocks are coded the same way as I 
picture intra blocks. There is a difference in 
the handling of the DC coefficients in that the 
predicted value is 128, unless the previous 
block was intra coded. 

Coding of Non-intra Blocks 

The coded block pattern (CBP) is used to
specify which blocks have coefficient data.
These are coded similarly to the coding of intra
blocks, except the DC coefficient is coded in
the same manner as the AC coefficients.

B Pictures 

Macroblocks 

There are 34 types of macroblocks in B 
pictures, as shown in Table 13.29, due to the 
additional complexity of backward motion 
compensation. 

For skipped macroblocks in B field pictures,
the decoder predicts from the field of the same
parity as the field being predicted. The
direction of prediction (forward, backward, or
bi-directional) is the same as the previous
macroblock, motion vector predictors are
unaffected, and the motion vectors are taken
from the appropriate motion vector predictors.
For skipped macroblocks in B frame pictures,
the same rules apply to the direction of
prediction, the motion vector predictors, and
the motion vectors.

If the [macroblock quant] column in Table 13.29
has a “1,” the quantizer scale is transmitted.
For the rest of the macroblock types, the
DCT correction is coded using the previous
value for the quantizer scale.

If the [motion forward] column in Table 13.29
has a “1,” horizontal and vertical forward
motion vectors are successively transmitted. If
the [motion backward] column in Table 13.29
has a “1,” horizontal and vertical backward
motion vectors are successively transmitted. If
both forward and backward motion types are
present, the vectors are transmitted in this
order:

horizontal forward 
vertical forward 
horizontal backward 
vertical backward 



If the [coded pattern] column in Table 13.29
has a “1,” the 6-bit coded block pattern is
transmitted as a variable-length code. This tells
the decoder which of the six blocks in the
macroblock are coded (“1”) and which are not
coded (“0”). Table 13.32 lists the codewords
assigned to the 63 possible combinations.
There is no code for when none of the blocks is
coded; this is indicated by the macroblock
type. For intra-coded macroblocks in P and B
pictures, the coded block pattern is not
transmitted, but is assumed to be a value of 63
(all blocks are coded). For 4:2:2 and 4:4:4
macroblocks, an additional two or six bits,
respectively, are used to extend the coded block
pattern.

Coding 

DCT coefficients of blocks are trans- 
formed into quantized coefficients and coded 
in the same way as they are for P pictures. 



Video Bitstream 

Figure 13.5 illustrates the video bitstream, 
a hierarchical structure with seven layers. 
From top to bottom the layers are: 

Video Sequence 
Sequence Header 
Group of Pictures (GOP) 

Picture 

Slice 

Macroblock (MB) 

Block 

Several extensions may be used to support
various levels of capability. These extensions
are:

Sequence Extension
Sequence Display Extension
Sequence Scalable Extension
Picture Coding Extension
Quant Matrix Extension
Picture Display Extension
Picture Temporal Scalable Extension
Picture Spatial Scalable Extension

Figure 13.5. MPEG-2 Video Bitstream Layer Structures. Marker and reserved bits not shown.

If the first sequence header of a video 
sequence is not followed by an extension start 
code (0x000001B5), then the video bitstream 
must conform to the MPEG-1 video bitstream. 

For MPEG-2 video bitstreams, an exten- 
sion start code (0x000001B5) and a sequence 
extension must follow each sequence header. 

Note that start codes (0x000001xx) must
be byte aligned by inserting 0-7 “0” bits before
the start code.
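Because start codes are byte aligned, a decoder can locate them with a simple byte search. The following is a minimal sketch of that idea, not a complete parser; the function name is hypothetical, and it assumes the whole bitstream is available in memory as a bytes object.

```python
def find_start_codes(data: bytes) -> list[tuple[int, int]]:
    """Return (byte_offset, code_value) for each 0x000001xx start code found."""
    prefix = b"\x00\x00\x01"
    found = []
    pos = data.find(prefix)
    # Stop if the prefix is missing or the code byte would lie past the end.
    while pos != -1 and pos + 3 < len(data):
        found.append((pos, data[pos + 3]))
        pos = data.find(prefix, pos + 3)
    return found
```

A code value of 0xB3 then marks a sequence header, 0xB5 an extension, and 0xB7 the end of the sequence, as described in this section.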

Video Sequence 

Sequence_end_code 

This 32-bit field has a value of 0x000001B7 
and terminates a video sequence. 

Sequence Header 

A sequence header should occur about 
every one-half second. The structure is shown 
in Figure 13.5. If not followed by a sequence 
extension, the bitstream conforms to MPEG-1. 

Sequence_header_code 

This 32-bit string has a value of 
0x000001B3 and indicates the beginning of a 
sequence header. 

Horizontal_size_value 

This is the twelve least significant bits of
the width (in samples) of the viewable portion
of the Y component. The two most significant
bits of the 14-bit value are specified in the
horizontal_size_extension. A value of zero is not
allowed.

Vertical_size_value 

This is the twelve least significant bits of 
the height (in scan lines) of the viewable por- 
tion of the Y component. The two most signifi- 
cant bits of the 14-bit value are specified in the 
vertical_size_extension. A value of zero is not 
allowed. 
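The 14-bit picture dimensions are split across the sequence header and the sequence extension. As a sketch of how the two parts recombine (the function name is hypothetical), the extension supplies the two most significant bits and the header value supplies the lower twelve:

```python
def full_size(size_value: int, size_extension: int = 0) -> int:
    """Combine a 12-bit *_size_value with its 2-bit *_size_extension."""
    if not 0 < size_value < (1 << 12):
        raise ValueError("size_value must be a non-zero 12-bit number")
    if not 0 <= size_extension < (1 << 2):
        raise ValueError("size_extension must be a 2-bit number")
    return (size_extension << 12) | size_value
```

With an extension of “00,” as required by the ATSC and OpenCable standards, the result is simply the 12-bit header value.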

Aspect_ratio_information 

This 4-bit codeword indicates either the 
sample aspect ratio (SAR) or display aspect 
ratio (DAR) as shown in Table 13.9. 

If sequence_display_extension is not
present, the SAR is determined as follows:

SAR = DAR x (horizontal_size/vertical_size)

If sequence_display_extension is present,
the SAR is determined as follows:

SAR = DAR x (display_horizontal_size/
display_vertical_size)
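The SAR calculation above can be sketched with exact fractions. This example assumes the DAR values listed in Table 13.9 (with 1/2.21 written as 100/221); the dictionary and function names are hypothetical. The caller passes either the coded picture size or the display size, depending on whether sequence_display_extension is present.

```python
from fractions import Fraction

# DAR values from Table 13.9, keyed by aspect_ratio_information code.
DAR_BY_CODE = {
    0b0010: Fraction(3, 4),
    0b0011: Fraction(9, 16),
    0b0100: Fraction(100, 221),  # 1/2.21
}


def sample_aspect_ratio(code: int, width: int, height: int) -> Fraction:
    """Return the SAR for a given aspect_ratio_information code and size."""
    if code == 0b0001:
        return Fraction(1)  # square samples, SAR given directly
    return DAR_BY_CODE[code] * Fraction(width, height)
```

For a 720 x 480 picture with a DAR code of “0010,” this gives 3/4 x 720/480 = 9/8.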

Frame_rate_code 

This 4-bit codeword indicates the frame 
rate, as shown in Table 13.10. 

The actual frame rate is determined as fol- 
lows: 

frame_rate = frame_rate_value x 

(frame_rate_extension_n + 1 )/ 
(frame_rate_extension_d + 1 ) 
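As a worked illustration of the frame rate formula, the sketch below looks up frame_rate_value from the codes in Table 13.10 and applies the two extension fields. The names are hypothetical, and exact fractions are used so that rates such as 30/1.001 are not rounded.

```python
from fractions import Fraction

# frame_rate_value for each frame_rate_code in Table 13.10.
FRAME_RATE_VALUE = {
    0b0001: Fraction(24000, 1001),
    0b0010: Fraction(24),
    0b0011: Fraction(25),
    0b0100: Fraction(30000, 1001),
    0b0101: Fraction(30),
    0b0110: Fraction(50),
    0b0111: Fraction(60000, 1001),
    0b1000: Fraction(60),
}


def frame_rate(code: int, ext_n: int = 0, ext_d: int = 0) -> Fraction:
    """frame_rate = frame_rate_value x (ext_n + 1) / (ext_d + 1)."""
    return FRAME_RATE_VALUE[code] * Fraction(ext_n + 1, ext_d + 1)
```

With both extension fields at zero, the result is simply the table value; for example, code “0100” gives 30000/1001, or about 29.97 frames per second.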

When an entry is specified in Table 13.10,
both frame_rate_extension_n and
frame_rate_extension_d are zero. If
progressive_sequence is “1,” the time between
two frames at the output of the decoder is the
reciprocal of the frame_rate. If
progressive_sequence is “0,” the time between
two fields at the output of the decoder is
one-half of the reciprocal of the frame_rate.

SAR          DAR          Code
forbidden    forbidden    0000
1.0000       -            0001
-            3/4          0010
-            9/16         0011
-            1/2.21       0100
-            reserved     0101 through 1111

Table 13.9. MPEG-2 aspect_ratio_information Codewords.

Frames per Second    Code
forbidden            0000
24/1.001             0001
24                   0010
25                   0011
30/1.001             0100
30                   0101
50                   0110
60/1.001             0111
60                   1000
reserved             1001 through 1111

Table 13.10. MPEG-2 frame_rate_code Codewords.

Bit_rate_value 

The 18 least significant bits of a 30-bit
binary number. The 12 most significant bits
are in the bit_rate_extension. This specifies the
bitstream bit-rate, measured in units of 400
bps, rounded upwards. A zero value is not
allowed. For the ATSC standard, the value
must be ≤ 48500 (≤ 97000 for high data rate
mode). For the OpenCable standard, the
value must be ≤ 67500 for 64QAM systems
(≤ 97000 for 256QAM systems).

Marker_bit 

Always a “1.” 

Vbv_buffer_size_value 

The 10 least significant bits of an 18-bit
binary number. The 8 most significant bits are
in the vbv_buffer_size_extension. Defines the
size of the Video Buffering Verifier needed to
decode the sequence. It is defined as:

B = 16 x 1024 x vbv_buffer_size

For the ATSC and OpenCable™ standards, the
value must be ≤ 488.
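The bit-rate and VBV buffer fields are each split between the sequence header and the sequence extension. As a sketch (function names are hypothetical), the two parts can be recombined and converted to bits per second and bits as follows:

```python
def bit_rate_bps(bit_rate_value: int, bit_rate_extension: int = 0) -> int:
    """Return the bit-rate in bps from the 18-bit value and 12-bit extension."""
    bit_rate = (bit_rate_extension << 18) | bit_rate_value
    return 400 * bit_rate  # the field is expressed in units of 400 bps


def vbv_buffer_bits(vbv_value: int, vbv_extension: int = 0) -> int:
    """Return B = 16 x 1024 x vbv_buffer_size, in bits."""
    vbv_buffer_size = (vbv_extension << 10) | vbv_value
    return 16 * 1024 * vbv_buffer_size
```

The ATSC limits quoted above therefore correspond to 48500 x 400 = 19.4 Mbps and 488 x 16 x 1024 = 7,995,392 bits.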

Constrained_parameters_flag 

This bit is set to a “0” since it has no mean- 
ing for MPEG-2. 

Load_intra_quantizer_matrix 

This bit is set to a “1” if an
intra_quantizer_matrix follows. If set to a “0,”
the default values below are used for intra
blocks (both Y and CbCr) until the next
occurrence of a sequence header or
quant_matrix_extension.



 8  16  19  22  26  27  29  34
16  16  22  24  27  29  34  37
19  22  26  27  29  34  34  38
22  22  26  27  29  34  37  40
22  26  27  29  32  35  40  48
26  27  29  32  35  40  48  58
26  27  29  34  38  46  56  69
27  29  35  38  46  56  69  83



Intra_quantizer_matrix 

An optional list of 64 8-bit values that
replace the current values. A value of zero is
not allowed. The value for intra_quant [0, 0] is
always 8. These values take effect until the
next occurrence of a sequence header or
quant_matrix_extension. For 4:2:2 and 4:4:4
data formats, the new values are used for both
the Y and CbCr intra matrix, unless a different
CbCr intra matrix is loaded.

Load_non_intra_quantizer_matrix 

This bit is set to a “1” if a
non_intra_quantizer_matrix follows. If set to a
“0,” the default values below are used for
non-intra blocks (both Y and CbCr) until the next
occurrence of a sequence header or
quant_matrix_extension.



16  16  16  16  16  16  16  16
16  16  16  16  16  16  16  16
16  16  16  16  16  16  16  16
16  16  16  16  16  16  16  16
16  16  16  16  16  16  16  16
16  16  16  16  16  16  16  16
16  16  16  16  16  16  16  16
16  16  16  16  16  16  16  16



Non_intra_quantizer_matrix

An optional list of 64 8-bit values that
replace the current values. A value of zero is
not allowed. These values take effect until the
next occurrence of a sequence header or
quant_matrix_extension. For 4:2:2 and 4:4:4
data formats, the new values are used for both
Y and CbCr non-intra matrix, unless a different
CbCr non-intra matrix is loaded.

User Data 

User_data_start_code

This optional 32-bit string of 0x000001B2
indicates the beginning of user_data. User_data
continues until the detection of another start
code.

User_data

These n x 8 bits are present only if
user_data_start_code is present. User_data
must not contain a string of 23 or more
consecutive zero bits.
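The restriction on consecutive zero bits keeps user data from imitating a start code prefix. A minimal sketch of a check an encoder or validator might apply is shown below; the function names are hypothetical and the payload is assumed to be available as a bytes object.

```python
def max_zero_bit_run(payload: bytes) -> int:
    """Return the length of the longest run of consecutive zero bits."""
    longest = run = 0
    for byte in payload:
        for shift in range(7, -1, -1):  # most significant bit first
            if (byte >> shift) & 1:
                run = 0
            else:
                run += 1
                longest = max(longest, run)
    return longest


def user_data_is_valid(payload: bytes) -> bool:
    """True if the payload has no run of 23 or more zero bits."""
    return max_zero_bit_run(payload) < 23
```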

Sequence Extension 

A sequence extension may only occur after 
a sequence header. 

Extension_start_code 

This 32-bit string of 0x000001B5 indicates 
the beginning of extension data beyond 
MPEG-1. 

Extension_start_code_ID 

This 4-bit field has a value of “0001” and
indicates the beginning of a sequence
extension. For MPEG-2 video bitstreams, a
sequence extension must follow each
sequence header.

Profile_and_level_indication

This 8-bit field specifies the profile and 
level, as shown in Table 13.11. 

Bit 7: escape bit 
Bits 6-4: profile ID 
Bits 3-0: level ID 
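To illustrate the bit layout, the sketch below splits the 8-bit field into its three parts and maps the profile and level IDs to the names in Table 13.11. The dictionary and function names are hypothetical; any ID not listed is reported as reserved.

```python
PROFILES = {
    0b001: "high",
    0b010: "spatial scalable",
    0b011: "SNR scalable",
    0b100: "main",
    0b101: "simple",
}
LEVELS = {0b0100: "high", 0b0110: "high 1440", 0b1000: "main", 0b1010: "low"}


def parse_profile_and_level(value: int) -> tuple[int, str, str]:
    """Return (escape_bit, profile_name, level_name) for the 8-bit field."""
    escape = (value >> 7) & 0b1
    profile_id = (value >> 4) & 0b111
    level_id = value & 0b1111
    return (
        escape,
        PROFILES.get(profile_id, "reserved"),
        LEVELS.get(level_id, "reserved"),
    )
```

For example, a value of 0x48 (binary 0100 1000) decodes as main profile at main level.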



Progressive_sequence 

A “1” for this bit indicates only progressive 
pictures are present. A “0” indicates both 
frame and field pictures may be present, and 
frame pictures may be progressive or inter- 
laced. For the SVCD standard, this value must 
be “0.” 

Chroma_format 

This 2-bit codeword indicates the CbCr
format, as shown in Table 13.12. For the ATSC
and OpenCable standards, the value must be
“01.”

Horizontal_size_extension 

The two most significant bits of
horizontal_size. For the ATSC and OpenCable
standards, the value must be “00.”

Vertical_size_extension

The two most significant bits of
vertical_size. For the ATSC and OpenCable
standards, the value must be “00.”

Bit_rate_extension

The twelve most significant bits of bit_rate.
For the ATSC and OpenCable standards, the
value must be “0000 0000 0000.”

Marker_bit 

Always a “1.” 

vbv_buffer_size_extension 

The eight most significant bits of 
vbv_buffer_size. For the ATSC and OpenCable 
standards, the value must be “0000 0000.” 

Low_delay 

A “1” for this bit indicates that no B pic- 
tures are present, so no frame reordering 
delay. For the SVCD standard, this value must 
be “0.” 







Profile             Profile ID Code    Level        Level ID Code
reserved            000                reserved     0000
high                001                reserved     0001
spatial scalable    010                reserved     0010
SNR scalable        011                reserved     0011
main                100                high         0100
simple              101                reserved     0101
reserved            110                high 1440    0110
reserved            111                reserved     0111
                                       main         1000
                                       reserved     1001
                                       low          1010
                                       reserved     1011
                                       reserved     1100
                                       reserved     1101
                                       reserved     1110
                                       reserved     1111

Table 13.11. MPEG-2 profile_and_level_indication Codewords.



Chroma Format    Code
reserved         00
4:2:0            01
4:2:2            10
4:4:4            11

Table 13.12. MPEG-2 chroma_format Codewords.

Figure 13.6. MPEG-2 Sequence Extension Structure. Marker bits not shown.



Frame_rate_extension_n 

See frame_rate_code regarding this 2-bit 
binary value. For the ATSC and OpenCable 
standards, the value must be “00.” 

Frame_rate_extension_d 

See frame_rate_code regarding this 5-bit 
binary value. For the ATSC and OpenCable 
standards, the value must be “00000.” 

Sequence Display Extension 

This optional extension may only occur 
after a sequence extension. 

Extension_start_code 

This 32-bit string of 0x000001B5 indicates 
the beginning of a new set of extension data. 

Extension_start_code_ID 

This 4-bit field has a value of “0010” and
indicates the beginning of a sequence display 
extension. Information provided by this exten- 
sion does not affect the decoding process and 
may be ignored. It allows the display of the 
decoded pictures to be as accurate as possible. 



Video_format 

This 3-bit codeword indicates the source of 
the pictures prior to MPEG encoding, as 
shown in Table 13.13. For the ATSC and Open- 
Cable standards, the value must be “000.” 

Color_description 

A “1” for this bit indicates that
color_primaries, transfer_characteristics, and
matrix_coefficients are present in the
bitstream.

Color_primaries 

This optional 8-bit codeword describes the
chromaticity coordinates of the source
primaries, as shown in Table 13.14. If
sequence_display_extension is not present, or
color_description = “0,” the indicated default
value must be used.

This information may be used to adjust the
color processing after MPEG-2 decoding to
compensate for the color primaries of the
display.

Video Format    Code
component       000
PAL             001
NTSC            010
SECAM           011
MAC             100
unspecified     101
reserved        110
reserved        111

Table 13.13. MPEG-2 video_format Codewords.



Color Primaries                                Code                           Application Default
forbidden                                      0000 0000
BT.709, SMPTE 274M, BT.1361, IEC 61966-2-4     0000 0001                      HDTV
unspecified                                    0000 0010
reserved                                       0000 0011
BT.470 system M                                0000 0100                      30 Hz SDTV
BT.470 system B, G                             0000 0101                      25 Hz SDTV
SMPTE 170M                                     0000 0110                      30 Hz SDTV
SMPTE 240M                                     0000 0111
reserved                                       0000 1000 through 1111 1111

Table 13.14. MPEG-2 color_primaries Codewords.

Opto-Electronic Transfer Characteristics       Code                           Application Default
forbidden                                      0000 0000
BT.709, SMPTE 274M, BT.1361                    0000 0001                      HDTV
unspecified                                    0000 0010
reserved                                       0000 0011
BT.470 system M                                0000 0100                      30 Hz SDTV
BT.470 system B, G                             0000 0101                      25 Hz SDTV
SMPTE 170M                                     0000 0110                      30 Hz SDTV
SMPTE 240M                                     0000 0111
linear                                         0000 1000
logarithmic (100:1 range)                      0000 1001
logarithmic (316:1 range)                      0000 1010
IEC 61966-2-4                                  0000 1011
BT.1361                                        0000 1100
reserved                                       0000 1101 through 1111 1111

Table 13.15. MPEG-2 transfer_characteristics Codewords.



Matrix Coefficients                                       Code                           Application Default
forbidden                                                 0000 0000
BT.709, SMPTE 274M, BT.1361, IEC 61966-2-4 (xvYCC709)     0000 0001                      HDTV
unspecified                                               0000 0010
reserved                                                  0000 0011
FCC                                                       0000 0100                      30 Hz SDTV
BT.470 system B, G, I, IEC 61966-2-4 (xvYCC601)           0000 0101                      25 Hz SDTV
SMPTE 170M                                                0000 0110                      30 Hz SDTV
SMPTE 240M                                                0000 0111
YCgCo                                                     0000 1000
reserved                                                  0000 1001 through 1111 1111

Table 13.16. MPEG-2 matrix_coefficients Codewords.

Figure 13.7. MPEG-2 Sequence Display Extension Structure. Marker bits not shown. (Fields, in order: extension start code, extension start code ID, video format, color description, color primaries, transfer characteristics, matrix coefficients, display horizontal size, display vertical size.)



Transfer_characteristics 

This optional 8-bit codeword describes the 
optoelectronic transfer characteristic of the 
source picture, as shown in Table 13.15. If
sequence_display_extension is not present, or
color_description = “0,” the indicated default
value must be used.

This information may be used to adjust the 
processing after MPEG-2 decoding to compen- 
sate for the gamma of the display. 

Matrix_coefficients 

This optional 8-bit codeword describes the
coefficients used in deriving YCbCr from
R'G'B', as shown in Table 13.16. If
sequence_display_extension is not present, or
color_description = “0,” the indicated default
value must be used.

This information is used to select the 
proper YCbCr-to-RGB matrix, if needed, after 
MPEG-2 decoding. 

Display_horizontal_size 

See display_vertical_size regarding this
14-bit binary number.

Marker_bit 

Always a “1.” 



Display_vertical_size 

This 14-bit binary number, in conjunction
with display_horizontal_size, defines the active
region of the display. If the display region is 
smaller than the encoded picture size, only a 
portion of the picture will be displayed. If the 
display region is larger than the picture size, 
the picture will be displayed on a portion of the 
display. 

Sequence Scalable Extension 

This optional extension may only occur 
after a sequence extension. 

Extension_start_code 

This 32-bit string of 0x000001B5 indicates
the beginning of a new set of extension data. 

Extension_start_code_ID 

This 4-bit field has a value of “0101” and
indicates the beginning of a sequence scalable
extension. This extension specifies the
scalability modes implemented for the video
bitstream. If sequence_scalable_extension is not
present in the bitstream, no scalability is used.
The base layer of a scalable hierarchy does not
have a sequence_scalable_extension, except in
the case of data partitioning.

Figure 13.8. MPEG-2 Sequence Scalable Extension Structure. Marker bits not shown. (Fields, in order: extension start code, extension start code ID, scalable mode, layer ID, lower layer prediction H size, lower layer prediction V size, horizontal subsampling factor M, horizontal subsampling factor N, vertical subsampling factor M, vertical subsampling factor N, picture mux enable, mux to progressive sequence, picture mux order, picture mux factor.)



Scalable_mode 

This 2-bit codeword indicates the scalabil- 
ity type of the video sequence as shown in 
Table 13.17. 



Scalable Mode          Code
data partitioning      00
spatial scalability    01
SNR scalability        10
temporal scalability   11

Table 13.17. MPEG-2 scalable_mode Codewords.



Layer_ID 

This 4-bit binary number identifies the
layers in a scalable hierarchy. The base layer has
an ID of “0000.” During data partitioning, 
layer_ID “0000” is assigned to partition layer 
zero and layer_ID “0001” is assigned to parti- 
tion layer one. 



Lower_layer_prediction_horizontal_size 

This optional 14-bit binary number is
present only if scalable_mode = “01.” It
indicates the horizontal size of the lower layer
frame used for prediction. It contains the value
of horizontal_size in the lower layer bitstream.

Marker_bit

Always a “1.” It is present only if
scalable_mode = “01.”

Lower_layer_prediction_vertical_size 

This optional 14-bit binary number is 
present only if scalable_mode = “01.” It indi- 
cates the vertical size of the lower layer frame 
used for prediction. It contains the value of 
vertical_size in the lower layer bitstream. 

Horizontal_subsampling_factor_m 

This optional 5-bit binary number is
present only if scalable_mode = “01,” and
affects the spatial upsampling process. A value
of “00000” is not allowed.

Horizontal_subsampling_factor_n

This optional 5-bit binary number is
present only if scalable_mode = “01,” and
affects the spatial upsampling process. A value
of “00000” is not allowed.

Vertical_subsampling_factor_m

This optional 5-bit binary number is
present only if scalable_mode = “01,” and
affects the spatial upsampling process. A value
of “00000” is not allowed.

Vertical_subsampling_factor_n 

This optional 5-bit binary number is 
present only if scalable_mode = “01,” and 
affects the spatial upsampling process. A value 
of “00000” is not allowed. 



Picture_mux_enable 

This optional 1-bit field is present only if
scalable_mode = “11.” If set to a “1,” the
picture_mux_order and picture_mux_factor
parameters are used for remultiplexing prior to
display.

Mux_to_progressive_sequence

This optional 1-bit field is present only if
scalable_mode = “11” and picture_mux_enable =
“1.” If set to a “1,” it indicates the decoded
pictures are to be temporally multiplexed to
generate a progressive sequence for display. When
temporal multiplexing is to generate an
interlaced sequence, this flag is a “0.”

Picture_mux_order

This optional 3-bit binary number is
present only if scalable_mode = “11.” It specifies
the number of enhancement layer pictures
prior to the first base layer picture. It is used to
assist the decoder in properly remultiplexing
pictures prior to display.

Picture_mux_factor

This optional 3-bit binary number is
present only if scalable_mode = “11.” It denotes
the number of enhancement layer pictures
between consecutive base layer pictures, and
is used to assist the decoder in properly
remultiplexing pictures prior to display.

Group of Pictures (GOP) Layer 

A GOP header should occur about every 
two seconds. Data for each group of pictures 
consists of a GOP header followed by picture 
data. The structure is shown in Figure 13.5. 
The DVD standard uses user data extensions 
at this layer for closed captioning data. 

Group_start_code 

This 32-bit string has a value of 
0x000001B8 and indicates the beginning of a 
group of pictures. 

Time_code 

These 25 bits indicate timecode
information, as shown in Table 13.18. Drop_frame_flag
may be set to “1” only if the frame rate is
30/1.001 (29.97) Hz.
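The 25-bit time_code can be unpacked with shifts and masks. The sketch below assumes the fields are packed in the order listed in Table 13.18, with drop_frame_flag in the most significant bit; the function name and the returned dictionary keys are hypothetical.

```python
def parse_time_code(tc: int) -> dict:
    """Split a 25-bit time_code into its fields (MSB-first field order assumed)."""
    if not 0 <= tc < (1 << 25):
        raise ValueError("time_code must fit in 25 bits")
    if not (tc >> 12) & 0x1:
        raise ValueError("marker_bit must be 1")
    return {
        "drop_frame_flag": (tc >> 24) & 0x1,
        "hours": (tc >> 19) & 0x1F,    # 5 bits, 0-23
        "minutes": (tc >> 13) & 0x3F,  # 6 bits, 0-59
        "seconds": (tc >> 6) & 0x3F,   # 6 bits, 0-59
        "pictures": tc & 0x3F,         # 6 bits, 0-59
    }
```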

Closed_gop 

This 1-bit flag is set to “1” if the group of 
pictures has been encoded without motion vec- 
tors referencing the previous group of pic- 
tures. This bit allows support of editing the 
compressed bitstream. 

Broken_link

This 1-bit flag is set to a “0” during
encoding. It is set to a “1” during editing when the B
frames following the first I frame of a group of
pictures cannot be correctly decoded.

Picture Layer

Data for each picture consists of a picture 
header followed by slice data. The structure is 
shown in Figure 13.5. If a sequence extension 
is present, each picture header is followed by a 
picture coding extension. 

Some implementations enable frame-accu- 
rate switching of aspect ratio information via 
user data extensions at this layer. The ATSC 
standard also uses user data extensions at this 
layer for CEA-708 closed captioning data. 

Picture_start_code 

This 32-bit string has a value of 
0x00000100. 

Temporal_reference 

For the first frame in a GOP, the 10-bit 
binary number temporal_reference is zero. It 
then increments by one, modulo 1024, for each 
frame in the display order. When a frame is 
coded as two fields, the temporal reference of 
both fields is the same. 



Picture_coding_type 

This 3-bit codeword indicates the picture 
type (I picture, P picture, or B picture) as 
shown in Table 13.19. 



Picture Type    Code
forbidden       000
I picture       001
P picture       010
B picture       011
forbidden       100
reserved        101
reserved        110
reserved        111

Table 13.19. MPEG-2 picture_coding_type Codewords.



Timecode              Range of Value    Number of Bits
drop_frame_flag                         1
time_code_hours       0-23              5
time_code_minutes     0-59              6
marker_bit            1                 1
time_code_seconds     0-59              6
time_code_pictures    0-59              6

Table 13.18. MPEG-2 time_code Field.

Vbv_delay

For constant bit-rates, this 16-bit binary
number sets the initial occupancy of the
decoding buffer at the start of decoding a picture so
that it doesn’t overflow or underflow. For the
ATSC and OpenCable standards, unless
vbv_delay has the value 0xFFFF, the value
must be ≤ 45000.



Full_pel_forward_vector

This optional 1-bit field is not used for
MPEG-2, so has a value of “0.” It is present
only if picture_coding_type = “010” or “011.”

Forward_f_code

This optional 3-bit field is not used for
MPEG-2, so has a value of “111.” It is present
only if picture_coding_type = “010” or “011.”

Full_pel_backward_vector

This optional 1-bit field is not used for
MPEG-2, so has a value of “0.” It is present
only if picture_coding_type = “011.”

Backward_f_code

This optional 3-bit field is not used for
MPEG-2, so has a value of “111.” It is present
only if picture_coding_type = “011.”



Extra_bit_picture 

A bit which, when set to “1,” indicates that
content_description_data follows. A “0”
indicates that no content_description_data follows.

Content_description_data

If extra_bit_picture = “1,” then this optional
variable-length field is present, with every
ninth bit having the value of “1.”

Extra_bit_picture

This optional bit has a value of “0” and is
present only if content_description_data is
present.

Content Description Data

This optional data is only present when
indicated by extra_bit_picture in the picture
header.

Data_type_upper

This 8-bit field contains the eight most
significant bits of the 16-bit binary data_type that
defines the type of content description data, as
shown in Table 13.20.

Marker_bit

Always a “1.”

Data_type_lower

This 8-bit field contains the eight least
significant bits of the 16-bit binary data_type that
defines the type of content description data, as
shown in Table 13.20.



Data Type               Code
reserved                0000 0000 0000 0000
padding bytes           0000 0000 0000 0001
capture timecode        0000 0000 0000 0010
pan-scan parameters     0000 0000 0000 0011
active region window    0000 0000 0000 0100
coded picture length    0000 0000 0000 0101
reserved                0000 0000 0000 0110 through 1111 1111 1111 1111

Table 13.20. MPEG-2 data_type Codewords.

Marker_bit

Always a “1.”

Data_length

This 8-bit binary number specifies the
remaining amount of data that follows,
expressed in units of 9 bits.



Note: The following fields are present when “padding
bytes” is indicated by data_type. The two fields are
repeated for the number of times indicated by the
data_length field.

Marker_bit 

Always a “1.” 

Padding_byte 

This 8-bit field has the value of “0000 
0000.” All other values are forbidden. 



Note: The following fields are present when “capture
timecode” is indicated by data_type. It contains
timestamps that indicate the source capture or creation time
of the fields or frames. It does not take precedence over
any timecode present at the system level.

Marker_bit 

Always a “1.” 

Timecode_type 

This 2-bit codeword indicates the number 
of timecodes associated with the picture, as 
shown in Table 13.21. 



Timecode Type                                 Code
one timecode for the frame                    00
one timecode for the first or only field      01
one timecode for the second field             10
two timecodes, one for each of two fields     11

Table 13.21. MPEG-2 timecode_type Codewords.



Counting_type 

This optional 3-bit codeword specifies the
method used for compensating the nframes
counting parameter to reduce drift
accumulation.

Reserved_bit 

Always a “0.” 

Reserved_bit 

Always a “0.” 

Reserved_bit 

Always a “0.” 



Marker_bit

This optional bit is always a “1.” This field
is present only when counting_type ≠ “000.”

Nframes_conversion_code

This optional bit specifies the conversion
factor (1000 + nframes_conversion_code) in
determining the amount of time indicated by
the nframes parameter. This field is present
only when counting_type ≠ “000.”

Clock_divisor

This optional 7-bit binary number specifies
the number of divisions of the 27 MHz system
clock to be applied for generating the
equivalent timestamp. This field is present only when
counting_type ≠ “000.”

Marker_bit

This optional bit is always a “1.” This field
is present only when counting_type ≠ “000.”

Nframes_multiplier_upper

This optional 8-bit value is the 8 most
significant bits of the 16-bit nframes_multiplier
value. This field is present only when
counting_type ≠ “000.”

Marker_bit

This optional bit is always a “1.” This field
is present only when counting_type ≠ “000.”

Nframes_multiplier_lower

This optional 8-bit value is the 8 least
significant bits of the 16-bit nframes_multiplier
value. This field is present only when
counting_type ≠ “000.”

“Field or frame capture timestamp” information follows.

Marker_bit

This optional bit is always a “1.” This field
is present only when counting_type ≠ “000.”

Nframes

This optional 8-bit binary number specifies
the number of frame time increments to add in
deriving the equivalent timestamp. This field is
present only when counting_type ≠ “000.”



Marker_bit 

Always a “1.” 

Time_discontinuity 

A “1” for this 1-bit flag indicates that a
discontinuity in the timecode sequence has
occurred.

Prior_count_dropped 

This 1-bit flag indicates if the counting of 
one or more values of nframes was dropped. 

Time_offset_part_a 

A 6-bit value containing the 6 most signifi- 
cant bits of time j)ff set. Timejoffset is a 30-bit 
signed value that specifies the number of clock 
cycles offset from the time specified by other 
timestamp parameters to specify the equiva- 
lent timestamp for when the current field or 
frame was captured. 

Marker_bit 

Always a “1.” 

Time_offset_part_b 

An 8-bit value containing the 8 second 
most significant bits of timejffset. 

Marker_bit 

Always a “1.” 

Time_offset_part_c 

An 8-bit value containing the 8 third most 
significant bits of timejffset. 

Marker_bit 

Always a “1.” 

Time_offset_part_d 

An 8-bit value containing the 8 least significant
bits of time_offset.
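As a rough illustration of how the four parts of the time offset fit together, the sketch below reassembles the 30-bit value after the marker bits have been discarded. The function name is hypothetical, and two's complement is an assumption: the text says only that the value is signed.

```python
def assemble_time_offset(part_a, part_b, part_c, part_d):
    """Combine the 6 + 8 + 8 + 8 bit parts into one signed 30-bit integer.

    Assumption: the 30-bit value is two's complement (the text only says
    "signed"). Marker bits are expected to have been removed already.
    """
    if not 0 <= part_a < (1 << 6):
        raise ValueError("part_a must fit in 6 bits")
    for part in (part_b, part_c, part_d):
        if not 0 <= part < (1 << 8):
            raise ValueError("parts b, c and d must each fit in 8 bits")

    # part_a holds the most significant bits, part_d the least significant.
    raw = (part_a << 24) | (part_b << 16) | (part_c << 8) | part_d

    # Sign-extend from bit 29 to obtain a negative offset where applicable.
    if raw & (1 << 29):
        raw -= 1 << 30
    return raw
```

With this convention, the parts (0x3F, 0xFF, 0xFF, 0xFF) give an offset of -1 clock cycle, and (0, 0, 0, 5) gives +5.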
Marker_bit

Always a “1.” 

Units_of_seconds 

A 4-bit binary number that indicates the 
seconds timestamp value. It may have a value 
of “0000” to “1001.” 

Tens_of_seconds 

A 4-bit binary number that indicates the 
tens of seconds timestamp value. It may have a 
value of “0000” to “0101.” 

Marker_bit 

Always a “1.” 

Units_of_minutes 

A 4-bit binary number that indicates the 
minutes timestamp value. It may have a value 
of “0000” to “1001.” 

Tens_of_minutes 

A 4-bit binary number that indicates the 
tens of minutes timestamp value. It may have a 
value of “0000” to “0101.” 

Marker_bit 

Always a “1.” 

Units_of_hours 

A 4-bit binary number that indicates the 
hours timestamp value. It may have a value of 
“0000” to “1001.” 

Tens_of_hours 

A 4-bit binary number that indicates the 
tens of hours timestamp value. It may have a 
value of “0000” to “0010.” 
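The six 4-bit fields above form a decimal-digit timecode. The following sketch, with a hypothetical function name, range-checks each digit against the limits given in the text and combines the digits into hours, minutes, and seconds. It makes no attempt to reject combinations the text does not rule out explicitly.

```python
def decode_capture_timecode(units_of_seconds, tens_of_seconds,
                            units_of_minutes, tens_of_minutes,
                            units_of_hours, tens_of_hours):
    """Return (hours, minutes, seconds) from the six 4-bit digit fields."""
    # (field name, value, largest value the text allows)
    fields = [
        ("units_of_seconds", units_of_seconds, 9),
        ("tens_of_seconds", tens_of_seconds, 5),
        ("units_of_minutes", units_of_minutes, 9),
        ("tens_of_minutes", tens_of_minutes, 5),
        ("units_of_hours", units_of_hours, 9),
        ("tens_of_hours", tens_of_hours, 2),
    ]
    for name, value, maximum in fields:
        if not 0 <= value <= maximum:
            raise ValueError(f"{name}={value} is outside 0..{maximum}")

    seconds = 10 * tens_of_seconds + units_of_seconds
    minutes = 10 * tens_of_minutes + units_of_minutes
    hours = 10 * tens_of_hours + units_of_hours
    return hours, minutes, seconds
```

For example, the digits 9, 5, 9, 5, 3, 2 (in the order the fields are transmitted) decode to 23:59:59.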



When timecode_type = “11”, the “field or frame capture
timestamp” fields are again present to convey the
information for the second field.



Note: The following fields are present when “pan-scan
parameters” is indicated by data_type. This allows the
transmission of additional pan-scan information for a
display that has a different aspect ratio.

Marker_bit 

Always a “1.” 

Aspect_ratio_information 

This 4-bit codeword is the same as used by 
the sequence header. 

Reserved_bit 

Always a “0.” 

Reserved_bit 

Always a “0.” 

Reserved_bit 

Always a “0.” 

Display_size_present 

A 1-bit flag that indicates whether or not
the display_horizontal_size and
display_vertical_size fields follow.



Marker_bit 

Always a “1.” This optional field is present
only if display_size_present = “1.”

Reserved_bit

Always a “0.” This optional field is present
only if display_size_present = “1.”

Reserved_bit

Always a “0.” This optional field is present
only if display_size_present = “1.”

Display_horizontal_size_upper

These are the 6 most significant bits of
display_horizontal_size. This optional field is
present only if display_size_present = “1.”

Marker_bit

Always a “1.” This optional field is present
only if display_size_present = “1.”

Display_horizontal_size_lower

These are the 8 least significant bits of
display_horizontal_size. This optional field is
present only if display_size_present = “1.”

Marker_bit

Always a “1.” This optional field is present
only if display_size_present = “1.”

Reserved_bit

Always a “0.” This optional field is present
only if display_size_present = “1.”

Reserved_bit

Always a “0.” This optional field is present
only if display_size_present = “1.”

Display_vertical_size_upper

These are the 6 most significant bits of
display_vertical_size. This optional field is
present only if display_size_present = “1.”

Marker_bit

Always a “1.” This optional field is present
only if display_size_present = “1.”

Display_vertical_size_lower

These are the 8 least significant bits of
display_vertical_size. This optional field is
present only if display_size_present = “1.”



Note: The following fields are present for each of the 
frame center offsets present. 

Marker_bit 

Always a “1.” 

Frame_center_horizontal_offset_upper 

These are the 8 most significant bits of
frame_center_horizontal_offset. The definition
of frame_center_horizontal_offset is specified in
the picture display extension.

Marker_bit

Always a “1.”

Frame_center_horizontal_offset_lower

These are the 8 least significant bits of
frame_center_horizontal_offset.

Marker_bit

Always a “1.”

Frame_center_vertical_offset_upper

These are the 8 most significant bits of
frame_center_vertical_offset. The definition of
frame_center_vertical_offset is specified in the
picture display extension.

Marker_bit

Always a “1.”

Frame_center_vertical_offset_lower

These are the 8 least significant bits of
frame_center_vertical_offset.
Note: The following fields are present when “active
region window” is indicated by data_type. The
active_region_window defines the rectangle in which the
decoded picture is intended to be displayed. It must not
be larger than the region defined by horizontal_size and
vertical_size.

Marker_bit 

Always a “1.” 

Top_left_x_upper 

This 8-bit field is the 8 most significant bits
of the 16-bit top_left_x. Top_left_x defines the Y
sample number that, together with top_left_y,
defines the upper left corner of the
active_region_window rectangle.

Marker_bit

Always a “1.”

Top_left_x_lower

This 8-bit field is the 8 least significant bits
of the 16-bit top_left_x.

Marker_bit

Always a “1.”

Top_left_y_upper

This 8-bit field is the 8 most significant bits
of the 16-bit top_left_y. Top_left_y defines the Y
line number that, together with top_left_x,
defines the upper left corner of the
active_region_window rectangle.

Marker_bit

Always a “1.”

Top_left_y_lower

This 8-bit field is the 8 least significant bits
of the 16-bit top_left_y.



Marker_bit 

Always a “1.” 

Active_region_horizontal_size_upper 

This 8-bit field is the 8 most significant bits
of the 16-bit active_region_horizontal_size.
Active_region_horizontal_size, along with
active_region_vertical_size, defines the bottom
right corner of the active_region_window
rectangle. A value of 0x0000 for
active_region_horizontal_size indicates the size
is unknown.

Marker_bit

Always a “1.”

Active_region_horizontal_size_lower

This 8-bit field is the 8 least significant bits
of the 16-bit active_region_horizontal_size.

Marker_bit

Always a “1.”

Active_region_vertical_size_upper

This 8-bit field is the 8 most significant bits
of the 16-bit active_region_vertical_size.
Active_region_vertical_size, along with
active_region_horizontal_size, defines the bottom
right corner of the active_region_window
rectangle. A value of 0x0000 for
active_region_vertical_size indicates the size is
unknown.

Marker_bit

Always a “1.”

Active_region_vertical_size_lower

This 8-bit field is the 8 least significant bits
of the 16-bit active_region_vertical_size.
Note: The following fields are present when “coded
picture length” is indicated by data_type.

Marker_bit 

Always a “1.” 

Picture_byte_count_part_a 

This 8-bit field is the 8 most significant bits
of the 32-bit picture_byte_count.
Picture_byte_count indicates the number of
bytes starting with the first byte of the first
slice_start_code of the current picture and ending
with the byte preceding the start code prefix
immediately following the last macroblock
of the picture. A value of 0x0000 indicates the
length is unknown.

Marker_bit 

Always a “1.” 

Picture_byte_count_part_b 

This 8-bit field is the eight second most
significant bits of the 32-bit picture_byte_count.

Marker_bit

Always a “1.”

Picture_byte_count_part_c

This 8-bit field is the eight third most significant
bits of the 32-bit picture_byte_count.

Marker_bit

Always a “1.”

Picture_byte_count_part_d

This 8-bit field is the eight least significant
bits of the 32-bit picture_byte_count.



Note: The following two fields are present when no other
data is present as indicated by data_type. The two fields
are repeated for the number of times indicated by the
data_length field.

Marker_bit 

Always a “1.” 

Reserved_content_description_data 

This 8-bit field is reserved. 

Picture Coding Extension 

A picture coding extension may only occur 
following a picture header. 

Extension_start_code 

This 32-bit string of 0x000001B5 indicates
the beginning of a new set of extension data.

Extension_start_code_ID

This 4-bit field has a value of “1000” and
indicates the beginning of a picture coding
extension.

f_code [0,0]

A 4-bit binary number, having a range of
“0001” to “1001,” that is used for the decoding
of forward horizontal motion vectors. A value
of “0000” is not allowed; a value of “1111”
indicates this field is ignored.

f_code [0,1]

A 4-bit binary number, having a range of
“0001” to “1001,” that is used for the decoding
of forward vertical motion vectors. A value of
“0000” is not allowed; a value of “1111”
indicates this parameter is ignored.
f_code [1,0]

A 4-bit binary number, having a range of 
“0001” to “1001,” that is used for the decoding 
of backward horizontal motion vectors. A value 
of “0000” is not allowed; a value of “1111” indi- 
cates this field is ignored. 

f_code [1,1] 

A 4-bit code, having a range of “0001” to 
“1001,” that is used for the decoding of back- 
ward vertical motion vectors. A value of “0000” 
is not allowed; a value of “1111” indicates this 
field is ignored. 

Intra_dc_precision 

This 2-bit codeword specifies the intra DC 
precision as shown in Table 13.22. 

Picture_structure 

This 2-bit codeword specifies the picture 
structure as shown in Table 13.23. 



Intra DC Precision (Bits)    Code

8                            00
9                            01
10                           10
11                           11

Table 13.22. MPEG-2 intra_dc_precision
Codewords.



Picture Structure    Code

reserved             00
top field            01
bottom field         10
frame picture        11

Table 13.23. MPEG-2 picture_structure
Codewords.




Figure 13.9. MPEG-2 Picture Coding Extension Structure. Marker bits not shown. 
Top_field_first

If progressive_sequence = “0,” this bit indicates
what field is output first by the decoder.
In a field, this bit has a value of “0.” In a frame,
a “1” indicates the first field of the decoded
frame is the top field. A value of “0” indicates
the first field is the bottom field.

If progressive_sequence = “1” and
repeat_first_field = “0,” this bit is a “0” and the
decoder generates a progressive frame.

If progressive_sequence = “1,”
repeat_first_field = “1,” and this bit is a “0,” the
decoder generates two identical progressive
frames.

If progressive_sequence = “1,”
repeat_first_field = “1,” and this bit is a “1,” the
decoder generates three identical progressive
frames.
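The progressive-sequence cases above amount to a small lookup. The sketch below restates them as a function; the name is hypothetical, and it deliberately covers only progressive_sequence = “1,” treating the combination the text excludes as an error.

```python
def progressive_frames_output(progressive_sequence, repeat_first_field,
                              top_field_first):
    """Number of identical progressive frames the decoder outputs.

    Only the progressive_sequence = 1 rules are modelled; for interlaced
    sequences top_field_first selects field order instead.
    """
    if progressive_sequence != 1:
        raise ValueError("only progressive_sequence = 1 is modelled here")
    if repeat_first_field == 0:
        # The text requires top_field_first = 0 in this case.
        if top_field_first != 0:
            raise ValueError(
                "top_field_first must be 0 when repeat_first_field = 0")
        return 1
    # repeat_first_field = 1: two frames, or three if top_field_first = 1.
    return 3 if top_field_first else 2
```

A decoder following these rules would call the function once per coded picture to decide how many times to present it.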

Frame_pred_frame_dct 

If this bit is a “1,” only frame-DCT and
frame prediction are used. For field pictures, it
is always a “0.” This parameter is a “1” if
progressive_frame is “1.”

Concealment_motion_vectors 

If this bit is a “1,” it indicates that the 
motion vectors are coded for intra macrob- 
locks. 

Q_scale_type 

This bit indicates which of two mappings
between quantizer_scale_code and
quantizer_scale are used by the decoder.

Intra_vlc_format 

This bit indicates which table is to be used
for DCT coefficients for intra blocks. Table
13.36 is used when intra_vlc_format = “0.”
Table 13.37 is used when intra_vlc_format =
“1.” For non-intra blocks, Table 13.36 is used
regardless of the value of intra_vlc_format.



Alternate_scan 

This bit indicates which scanning pattern 
is to be used by the decoder for transform coef- 
ficient data. “0” = Figure 7.59; “1” = Figure 7.60. 

Repeat_first_field 

See top_field_first for the use of this bit. For
field pictures, it has a value of “0.”

Chroma_420_type

If chroma_format is 4:2:0, this bit is the
same as progressive_frame. Otherwise, it is a
“0.”

Progressive_frame 

If a “0,” this bit indicates the two fields of 
the frame are interlaced fields, with a time 
interval between them. If a “1,” the two fields 
of the frame are from the same instant in time. 

Composite_display_flag 

This bit indicates whether or not v_axis,
field_sequence, sub_carrier, burst_amplitude,
and sub_carrier_phase are present in the
bitstream.



V_axis 

This bit is present only when
composite_display_flag = “1.” It is used when
the original source was a PAL video signal.
v_axis = “1” on a positive V sign, “0” otherwise.

This information can be obtained from a
PAL decoder that is driving the MPEG-2
encoder. It can be used to enable an MPEG-2
decoder to set the V switching of a PAL
encoder to the same as the original.
Field_sequence

This 3-bit codeword is present only when
composite_display_flag = “1.” It specifies the
number of the field in the original four- or
eight-field sequence as shown in Table 13.24.

This information can be obtained from an
NTSC/PAL decoder that is driving the MPEG-2
encoder. It can be used to enable an MPEG-2
decoder to set the field sequence of an NTSC/PAL
encoder to the same as the original.



Frame Sequence    Field Sequence    Code

1                 1                 000
1                 2                 001
2                 3                 010
2                 4                 011
3                 5                 100
3                 6                 101
4                 7                 110
4                 8                 111

Table 13.24. MPEG-2 field_sequence
Codewords.



Sub_carrier 

This bit is present only when
composite_display_flag = “1.” A “0” indicates
that the original subcarrier-to-line frequency
relationship was correct.

This information can be obtained from the 
NTSC/PAL decoder that is driving the MPEG- 
2 encoder. 



Burst_amplitude 

This 7-bit binary number is present only
when composite_display_flag = “1.” It specifies
the original PAL or NTSC burst amplitude
when quantized per BT.601 (ignoring the
MSB).

This information can be obtained from a 
NTSC/PAL decoder that is driving the MPEG- 
2 encoder. It can be used to enable an MPEG-2 
decoder to set the color burst amplitude of a 
NTSC/PAL encoder to the same as the origi- 
nal. 

Sub_carrier_phase 

This 8-bit binary number is present only
when composite_display_flag = “1.” It specifies
the original PAL or NTSC subcarrier phase as
defined in BT.470. The value is defined as:
(360°/256) × sub_carrier_phase.

This information can be obtained from an 
NTSC/PAL decoder that is driving the MPEG- 
2 encoder. It can be used to enable an MPEG-2 
decoder to set the color subcarrier phase of a 
NTSC/PAL encoder to the same as the origi- 
nal. 
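The phase formula is a linear scaling of the 8-bit field onto a full circle, so each step of the codeword is about 1.4°. A minimal sketch, with a hypothetical function name:

```python
def sub_carrier_phase_degrees(sub_carrier_phase):
    """Convert the 8-bit sub_carrier_phase field to degrees.

    Implements (360 / 256) * sub_carrier_phase from the text.
    """
    if not 0 <= sub_carrier_phase <= 255:
        raise ValueError("sub_carrier_phase is an 8-bit value")
    return sub_carrier_phase * 360 / 256
```

A field value of 64 therefore corresponds to 90°, and the largest value, 255, to just under 360°.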

Quant Matrix Extension 

Each quantization matrix has default val- 
ues. When a sequence header is decoded, all 
matrices reset to their default values. User- 
defined matrices may be downloaded during a 
sequence header or using this extension. This 
optional extension may only occur after a pic- 
ture coding extension. 

Extension_start_code 

This 32-bit string of 0x000001B5 indicates
the beginning of a new set of extension data. 
Extension_start_code_ID

This 4-bit string has a value of “0011” and
indicates the beginning of a
quant_matrix_extension. This extension also
allows quantizer matrices to be transmitted for
the 4:2:2 and 4:4:4 chroma formats.

Load_intra_quantizer_matrix 

This bit is set to a “1” if an
intra_quantizer_matrix follows. If set to a “0,”
the default values below are used for intra
blocks until the next occurrence of a sequence
header or quant_matrix_extension.



 8  16  19  22  26  27  29  34
16  16  22  24  27  29  34  37
19  22  26  27  29  34  34  38
22  22  26  27  29  34  37  40
22  26  27  29  32  35  40  48
26  27  29  32  35  40  48  58
26  27  29  34  38  46  56  69
27  29  35  38  46  56  69  83



Intra_quantizer_matrix 

An optional list of 64 8-bit values that
replace the default values shown above. A
value of zero is not allowed. The value for
intra_quant [0, 0] is always 8. These values
take effect until the next occurrence of a
sequence header or quant_matrix_extension.
The order follows that shown in Figure 7.59.

For 4:2:2 and 4:4:4 data formats, the new 
values are used for both the Y and CbCr intra 
matrix, unless a different CbCr intra matrix is 
loaded. 
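The sketch below shows one way a decoder might place 64 downloaded values into an 8 × 8 matrix. It assumes that Figure 7.59 is the conventional zigzag scan (the scan selected by alternate_scan = “0”); both function names are hypothetical.

```python
def zigzag_positions(n=8):
    """Return (row, column) pairs in conventional zigzag scan order.

    Assumption: this is the order shown in Figure 7.59.
    """
    order = []
    for s in range(2 * n - 1):          # s = row + column of each diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)       # even diagonals run bottom-left to top-right
        order.extend((r, s - r) for r in rows)
    return order


def load_quantizer_matrix(values, intra=True):
    """Place 64 transmitted 8-bit values into an 8x8 matrix."""
    values = list(values)
    if len(values) != 64:
        raise ValueError("exactly 64 values are required")
    if any(not 1 <= v <= 255 for v in values):
        raise ValueError("values must be 8-bit and non-zero")
    if intra and values[0] != 8:
        raise ValueError("the intra [0, 0] entry is always 8")
    matrix = [[0] * 8 for _ in range(8)]
    for value, (row, col) in zip(values, zigzag_positions()):
        matrix[row][col] = value
    return matrix
```

The same routine can serve the non-intra and chroma matrices by passing intra=False where the [0, 0] constraint does not apply.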

Load_non_intra_quantizer_matrix 

This bit is set to a “1” if a
non_intra_quantizer_matrix follows. If set to a
“0,” the default values below are used for
non-intra blocks until the next occurrence of a
sequence header or quant_matrix_extension.



16  16  16  16  16  16  16  16
16  16  16  16  16  16  16  16
16  16  16  16  16  16  16  16
16  16  16  16  16  16  16  16
16  16  16  16  16  16  16  16
16  16  16  16  16  16  16  16
16  16  16  16  16  16  16  16
16  16  16  16  16  16  16  16



[Figure: field order is Extension Start Code, Extension Start Code ID,
Load Intra Quantizer Matrix, Intra Quantizer Matrix,
Load Non-Intra Quantizer Matrix, Non-Intra Quantizer Matrix,
Load Chroma Intra Quantizer Matrix, Chroma Intra Quantizer Matrix,
Load Chroma Non-Intra Quantizer Matrix, Chroma Non-Intra Quantizer Matrix.]

Figure 13.10. MPEG-2 Quant Matrix Extension Structure. Marker bits not shown.
Non-intra_quantizer_matrix

An optional list of 64 8-bit values that
replace the default values shown above. A
value of zero is not allowed. These values take
effect until the next occurrence of a sequence
header or quant_matrix_extension. The order
follows that shown in Figure 7.59.

For 4:2:2 and 4:4:4 data formats, the new 
values are used for both the Y and CbCr non- 
intra matrix, unless a new CbCr non-intra 
matrix is loaded. 

Load_chroma_intra_quantizer_matrix 

This bit is set to a “1” if a
chroma_intra_quantizer_matrix follows. If set
to a “0,” there is no change in the values used.
If chroma_format is 4:2:0, this bit is a “0.”

Chroma_intra_quantizer_matrix

An optional list of 64 8-bit values that
replace the previous or default values used for
CbCr data. A value of zero is not allowed. The
value for chroma_intra_quant [0,0] is always 8.
These values take effect until the next occurrence
of a sequence header or
quant_matrix_extension. The order follows that
shown in Figure 7.59.

Load_chroma_non_intra_quantizer_matrix

This bit is set to a “1” if a
chroma_non_intra_quantizer_matrix follows. If
set to a “0,” there is no change in the values
used. If chroma_format is 4:2:0, this bit is a “0.”

Chroma_non_intra_quantizer_matrix

An optional list of 64 8-bit values that
replace the previous or default values used for
CbCr data. A value of zero is not allowed.
These values take effect until the next occurrence
of a sequence header or
quant_matrix_extension. The order follows that
shown in Figure 7.59.

Picture Display Extension 

This extension allows the position of the 
display rectangle to be moved on a picture-by- 
picture basis. A typical application would be 
implementing pan-and-scan. This optional 
extension may only occur after a picture cod- 
ing extension. 

Extension_start_code 

This 32-bit string of 0x000001B5 indicates
the beginning of a new set of extension data. 



[Figure: field order is Extension Start Code, Extension Start Code ID,
Frame Center Horizontal Offset, Frame Center Vertical Offset.]

Figure 13.11. MPEG-2 Picture Display Extension Structure. Marker bits not shown.
[Figure: field order is Extension Start Code, Extension Start Code ID,
Reference Select Code, Forward Temporal Reference,
Backward Temporal Reference.]

Figure 13.12. MPEG-2 Picture Temporal Scalable Extension Structure. Marker bits not shown.



Extension_start_code_ID 

This 4-bit field has a value of “0111” and
indicates the beginning of a picture display
extension.

In the case of an interlaced sequence, a pic- 
ture may relate to one, two, or three decoded 
fields. Thus, there may be up to three sets of 
the following four fields present in the bit- 
stream. 

Frame_center_horizontal_offset 

This 16-bit 2’s complement number specifies
the horizontal offset in units of 1/16th of a
sample. A positive value positions the center of
the decoded picture to the right of the center
of the display region.

Marker_bit 

Always a “1.” 

Frame_center_vertical_offset 

This 16-bit 2’s complement number specifies
the vertical offset in units of 1/16th of a
scan line. A positive value positions the center
of the decoded picture below the center of the
display region.

Marker_bit 

Always a “1.” 
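Both frame center offsets share the same encoding: a 16-bit 2’s complement count of sixteenths. As an illustration only, the hypothetical helper below converts the raw 16-bit field into samples (horizontal) or scan lines (vertical).

```python
def frame_center_offset(raw):
    """Convert a raw 16-bit frame center offset to samples or scan lines.

    raw is the unsigned 16-bit field as read from the bitstream; the
    result is signed and expressed in whole units (1 unit = 16 steps).
    """
    if not 0 <= raw <= 0xFFFF:
        raise ValueError("raw must be a 16-bit value")
    # Interpret bit 15 as the sign bit (2's complement).
    signed = raw - 0x10000 if raw & 0x8000 else raw
    return signed / 16
```

Under this reading, 0x0010 is one sample to the right (or one line down), and 0xFFF0 is one sample to the left (or one line up).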



Picture Temporal Scalable Extension 

This optional extension may only occur 
after a picture coding extension. 

Extension_start_code 

This 32-bit string of 0x000001B5 indicates
the beginning of a new set of extension data. 

Extension_start_code_ID 

This 4-bit value of “1010” indicates the
beginning of a picture temporal scalable
extension.

Reference_select_code 

This 2-bit codeword identifies reference 
frames or fields for prediction. 

Forward_temporal_reference 

This 10-bit binary number indicates the 
temporal reference of the lower layer to be 
used to provide the forward prediction. If more 
than 10 bits are required to specify the tempo- 
ral reference, only the 10 LSBs are used. 

Marker_bit

Always a “1.” 
Lower Layer     Lower Layer    Progressive    Apply          Use For
Deinterlaced    Progressive    Frame          Deinterlace    Prediction
Field Select    Frame                         Process

0               0              1              yes            top field
1               0              1              yes            bottom field
1               1              1              no             frame
1               1              0              no             frame
1               0              0              yes            both fields

Table 13.25. MPEG-2 Picture Spatial Scalable Extension Upsampling Process.



Backward_temporal_reference 

This 10-bit binary number indicates the 
temporal reference of the lower layer to be 
used to provide the backward prediction. If 
more than 10 bits are required to specify the 
temporal reference, only the 10 LSBs are used. 

Picture Spatial Scalable Extension 

This optional extension may only occur 
after a picture coding extension. 

Extension_start_code 

This 32-bit string of 0x000001B5 indicates 
the beginning of a new set of extension data. 



Extension_start_code_ID 

This 4-bit value of “1001” indicates the
beginning of a picture spatial scalable
extension.

Lower_layer_temporal_reference 

This 10-bit binary number indicates the 
temporal reference of the lower layer to be 
used to provide the prediction. If more than 10 
bits are required to specify the temporal refer- 
ence, only the 10 LSBs are used. 

Marker_bit

Always a “1.” 




Figure 13.13. MPEG-2 Picture Spatial Scalable Extension Structure. Marker bits not shown. 
Lower_layer_horizontal_offset

This 15-bit 2’s complement number indi- 
cates the horizontal offset of the top-left corner 
of the upsampled lower layer picture relative to 
the enhancement layer picture. This parameter 
must be an even number for the 4:2:0 and 4:2:2 
formats. 

Marker_bit 

Always a “1.” 

Lower_layer_vertical_offset 

This 15-bit 2’s complement number indi- 
cates the vertical offset of the top-left corner of 
the upsampled lower layer picture relative to 
the enhancement layer picture. This parameter 
must be an even number for the 4:2:0 format. 

Spatial_temporal_weight_code_table_index 

This 2-bit codeword indicates which spatial 
temporal weight codes are to be used. 

Lower_layer_progressive_frame 

This bit is “1” if the lower layer picture is 
progressive. 

Lower_layer_deinterlaced_field_select 

This bit is used in conjunction with other 
parameters to assist the decoder. See Table 
13.25. 

Copyright Extension 

This optional extension may only occur 
after a picture coding extension. 

Extension_start_code 

This 32-bit string of 0x000001B5 indicates 
the beginning of a new set of extension data. 



Extension_start_code_ID 

This 4-bit value of “0100” indicates the 
beginning of a copyright extension. 

Copyright_flag 

A “1” for this bit specifies the following 
video content, up to the next copyright exten- 
sion, is copyrighted. A “0” does not indicate 
whether the following video content is copy- 
righted or not. 

Copyright_identifier 

This 8-bit binary number indicates the
copyright holder. A value of “0000 0000”
indicates the information is not available. When
copyright_flag = “0,” copyright_identifier must
be “0000 0000.”

Original_or_copy 

A “1” for this bit indicates original material; 
a “0” indicates that it is a copy. 

Reserved 

These seven bits are always a “000 0000.” 

Marker_bit 

Always a “1.” 

Copyright_number_1

These 20 bits represent bits 44-63 of the 
copyright number. 

Marker_bit 

Always a “1.” 

Copyright_number_2 

These 22 bits represent bits 22-43 of the 
copyright number. 
Marker_bit

Always a “1.” 

Copyright_number_3 

These 22 bits represent bits 0-21 of the
copyright number. The 64-bit
copyright_number uniquely identifies the
copyrighted content. When copyright_identifier =
“0000 0000,” or copyright_flag = “0,” the
copyright_number must be zero.
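To make the 20 + 22 + 22 bit split concrete, the sketch below reassembles the 64-bit copyright number and checks the two consistency rules stated in the text. Both function names are hypothetical.

```python
def assemble_copyright_number(number_1, number_2, number_3):
    """Combine the three transmitted parts into the 64-bit copyright number.

    number_1 carries bits 44-63 (20 bits), number_2 bits 22-43 (22 bits),
    and number_3 bits 0-21 (22 bits).
    """
    if not 0 <= number_1 < (1 << 20):
        raise ValueError("copyright_number_1 must fit in 20 bits")
    if not 0 <= number_2 < (1 << 22):
        raise ValueError("copyright_number_2 must fit in 22 bits")
    if not 0 <= number_3 < (1 << 22):
        raise ValueError("copyright_number_3 must fit in 22 bits")
    return (number_1 << 44) | (number_2 << 22) | number_3


def copyright_fields_consistent(copyright_flag, copyright_identifier,
                                copyright_number):
    """Check the constraints the text places on the copyright fields."""
    # copyright_flag = 0 requires an all-zero identifier.
    if copyright_flag == 0 and copyright_identifier != 0:
        return False
    # A zero flag or a zero identifier requires a zero copyright number.
    if (copyright_flag == 0 or copyright_identifier == 0) and copyright_number != 0:
        return False
    return True
```

A bitstream checker could call the second function after parsing each copyright extension to flag non-conforming combinations.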

Camera Parameters Extension 

This optional extension may only occur 
after a picture coding extension. 

After the 32-bit extension_start_code of 
0x000001B5, and 4-bit extension_start_code_ID 
of “1011,” there are several fields that specify 
the location and characteristics of the camera 
used. 

ITU-T ext. D Extension 

This optional extension may only occur 
after a picture coding extension. 

After the 32-bit extension_start_code of 
0x000001B5, and 4-bit extension_start_code_ID 
of “1100,” there is one bit of data. The use of 
this extension is defined in ITU-T H.320 Annex 
A. 

Slice Layer 

Data for each slice layer consists of a slice 
header followed by macroblock data. The 
structure is shown in Figure 13.5. 

Slice_start_code 

The first 24 bits have a value of 0x000001.
The last 8 bits are slice_vertical_position, and
have a value of 0x01-0xAF.



The slice_vertical_position specifies the
vertical position in macroblock units of the
first macroblock in the slice. The
slice_vertical_position of the first row of
macroblocks is one.
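A slice start code can thus be recognized from its prefix and the range of its final byte. The following sketch, using a hypothetical function name, extracts slice_vertical_position from a 4-byte start code. It ignores slice_vertical_position_extension, so it applies only to pictures that do not use that field.

```python
def slice_vertical_position(start_code):
    """Return slice_vertical_position from a 4-byte slice start code.

    Raises ValueError if the bytes are not a slice start code, that is,
    if the prefix is not 0x000001 or the last byte is outside 0x01-0xAF.
    """
    if len(start_code) != 4 or start_code[:3] != b"\x00\x00\x01":
        raise ValueError("missing 0x000001 start code prefix")
    position = start_code[3]
    if not 0x01 <= position <= 0xAF:
        raise ValueError("last byte is not in the slice range 0x01-0xAF")
    return position


def macroblock_row_index(position):
    """Zero-based macroblock row; the first row has position one."""
    return position - 1
```

For instance, the bytes 00 00 01 01 mark a slice starting in the first macroblock row, while 00 00 01 B5 (an extension start code) is rejected.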

Slice_vertical_position_extension 

This optional 3-bit binary number represents
the three MSBs of an 11-bit
slice_vertical_position value if the vertical size
of the frame is >2800 lines. If the vertical size
of the frame is ≤2800 lines, this field is not
present.

Priority_breakpoint 

This optional 7-bit binary number is
present only when sequence_scalable_extension
is present in the bitstream and scalable_mode =
data partitioning. It specifies where in the
bitstream to partition.

Quantizer_scale_code 

This 5-bit binary value has a value of 1 to
31 (a value of zero is forbidden). It specifies
the scale factor of the reconstruction level of
the received DCT coefficients. The decoder
uses this value until another
quantizer_scale_code is received at either the
slice or macroblock layer.

Slice_extension_flag 

If this optional bit is set to a “1,” intra_slice,
slice_picture_ID_enable, and slice_picture_ID
fields follow.



Intra_slice 

This optional bit is present only if
slice_extension_flag = “1.” It must be set to a “0”
if any macroblocks in the slice are non-intra
macroblocks.

Slice_picture_ID_enable

This optional bit is present only if
slice_extension_flag = “1.” A value of “1”
indicates that slice_picture_ID is used.

Slice_picture_ID

These optional six bits are intended to aid
recovery from severe error bursts.
Slice_picture_ID must have the same
value for all slices of a picture. This field is
present only if slice_extension_flag = “1.” If
slice_picture_ID_enable = “0,” these bits must
be “00 0000.”



Extra_bit_slice 

A bit which, when set to “1,” indicates that
extra_information_slice follows. A value of “0”
indicates no data is after this field.

Extra_information_slice 

If extra_bit_slice = “1,” these 9 bits follow,
consisting of 8 bits of data
(extra_information_slice) and then another
extra_bit_slice to indicate if a further 9 bits
follow, and so on.
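The extra_bit_slice and extra_information_slice pair forms a simple flag-terminated loop. A minimal sketch, assuming the bitstream is available as an iterable of 0/1 integers positioned at the first extra_bit_slice (the function name and that input representation are illustrative choices, not part of the standard):

```python
def read_extra_information_slice(bits):
    """Collect extra_information_slice bytes from a stream of bits.

    bits is an iterable of 0/1 integers starting at the first
    extra_bit_slice. Reading stops at the first extra_bit_slice of 0.
    """
    stream = iter(bits)
    payload = []
    # Each extra_bit_slice of 1 announces one 8-bit data field.
    while next(stream) == 1:
        byte = 0
        for _ in range(8):
            byte = (byte << 1) | next(stream)   # most significant bit first
        payload.append(byte)
    return payload
```

A lone “0” bit yields an empty list, which is the common case when no extra slice information is carried.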

Macroblock Layer 

Data for each macroblock layer consists of 
a macroblock header followed by motion vec- 
tor and block data, as shown in Figure 13.5. 

Macroblock_escape 

This optional 11-bit field is a fixed bit string 
of “0000 0001 000” and is used when the differ- 
ence between the current macroblock address 
and the previous macroblock address is 
greater than 33. It forces the value of 
macroblock_address_increment to be increased 
by 33. Any number of consecutive 
macroblock_escape fields may be used. 



Macroblock_address_increment 

This is a variable-length codeword that
specifies the difference between the current
macroblock address and the previous macroblock
address. It has a maximum value of 33.
Values greater than 33 are encoded using the
macroblock_escape field. The variable-length
codes are listed in Table 13.26.
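The interaction between macroblock_escape and macroblock_address_increment can be sketched as follows. The codes are those of Table 13.26; the bitstream is represented as a string of “0” and “1” characters purely for illustration, and the function name is hypothetical.

```python
ESCAPE = "00000001000"   # macroblock_escape, adds 33 to the increment

# Codes for increment values 1..33, in order (Table 13.26).
_CODES = [
    "1", "011", "010", "0011", "0010", "00011", "00010",
    "0000111", "0000110", "00001011", "00001010", "00001001",
    "00001000", "00000111", "00000110", "0000010111", "0000010110",
    "0000010101", "0000010100", "0000010011", "0000010010",
    "00000100011", "00000100010", "00000100001", "00000100000",
    "00000011111", "00000011110", "00000011101", "00000011100",
    "00000011011", "00000011010", "00000011001", "00000011000",
]
VLC = {code: value for value, code in enumerate(_CODES, start=1)}


def decode_address_increment(bits, pos=0):
    """Decode escapes plus one increment code starting at bits[pos].

    Returns (increment, new_position).
    """
    increment = 0
    # Any number of consecutive escapes may precede the codeword.
    while bits.startswith(ESCAPE, pos):
        increment += 33
        pos += len(ESCAPE)
    # The codes are prefix-free, so the first match is the only match.
    for length in range(1, 12):
        value = VLC.get(bits[pos:pos + length])
        if value is not None:
            return increment + value, pos + length
    raise ValueError("no valid macroblock_address_increment code found")
```

For example, one escape followed by the code “011” decodes to an increment of 33 + 2 = 35.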

Macroblock_type 

This variable-length codeword indicates 
the method of coding and macroblock content 
according to Tables 13.27, 13.28, and 13.29. 

Spatial_temporal_weight_code 

This optional 2-bit codeword indicates, in
the case of spatial scalability, how the spatial
and temporal predictions are combined to do
the prediction for the macroblock. This field is
present only if the [spatial temporal weight
class] = 1 in Tables 13.27, 13.28, and 13.29, and
spatial_temporal_weight_code_table_index ≠
“00.”

Frame_motion_type 

This optional 2-bit codeword indicates the
macroblock motion prediction, as shown in
Table 13.30. It is present only if
picture_structure = frame,
frame_pred_frame_dct = “0,” and [motion forward]
or [motion backward] = “1” in Tables
13.27, 13.28, and 13.29.

Field_motion_type 

This optional 2-bit codeword indicates the
macroblock motion prediction, as shown in
Table 13.31. It is present only if [motion forward]
or [motion backward] = “1” in Tables
13.27, 13.28, and 13.29 and the
frame_motion_type field is not present.
Dct_type

This optional bit indicates whether the
macroblock is frame or field DCT coded. “1” =
field, “0” = frame. It is present only if
picture_structure = “11,” frame_pred_frame_dct
= “0,” and [intra macroblock] or [coded pattern]
= “1” in Tables 13.27, 13.28, and 13.29.

Quantizer_scale_code 

This optional 5-bit binary number has a 
value of 1-31 (a value of zero is forbidden). It 
specifies the scale factor of the reconstruction 
level of the received DCT coefficients. The 
decoder uses this value until another 
quantizer_scale_code is received. This field is 
present only when [macroblock quant] = “1” in 
Tables 13.27, 13.28, and 13.29. 

Optional Motion Vectors 

Marker_bit 

This always has a value of “1.” It is present 
only if concealment_motion_vectors = “1” and 
[intra macroblock] = “1” in Tables 13.27, 13.28, 
and 13.29. 

Coded_block_pattern_420 

This optional variable-length codeword is 
used to derive the 4:2:0 coded block pattern 
(CBP) as shown in Table 13.32. It is present 
only if [coded pattern] = “1” in Tables 13.27, 
13.28, and 13.29, and indicates which blocks in 
the macroblock have at least one transform 
coefficient transmitted. The coded block pat- 
tern number is represented as: 

P1 P2 P3 P4 P5 P6

where Pn = “1” for any coefficient present for
block [n], else Pn = “0.” Block numbering (decimal
format) is given in Figure 13.2.
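As an illustration, the 6-bit coded block pattern can be unpacked into the list of blocks that carry coefficients. The sketch assumes P1 is the most significant bit of the number, which follows from the order P1 to P6 in which it is written, and numbers the blocks 1 to 6 to match Pn; the function name is hypothetical.

```python
def coded_blocks_420(cbp):
    """Return the block numbers (1..6) flagged in a 4:2:0 coded block pattern.

    Assumption: P1 is the most significant of the six bits, P6 the least.
    """
    if not 0 <= cbp <= 0b111111:
        raise ValueError("the 4:2:0 coded block pattern is a 6-bit value")
    return [n for n in range(1, 7) if cbp & (1 << (6 - n))]
```

A pattern of 60 (“111100”) therefore marks blocks 1 through 4 as coded and blocks 5 and 6 as having no transmitted coefficients.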



Coded_block_pattern_1

Present only if chroma_format = 4:2:2 and
[coded pattern] = “1” in Tables 13.27, 13.28,
and 13.29. This optional 2-bit field is used to
extend the coded block pattern by two bits.

Coded_block_pattern_2 

Present only if chroma_format = 4:4:4 and 
[coded pattern] = “1” in Tables 13.27, 13.28, 
and 13.29. This optional 6-bit field is used to 
extend the coded block pattern by 6 bits. 

Block Layer 

Data for each block layer consists of coeffi- 
cient data. The structure is shown in Figure 
13.5. 

Dct_dc_size_luminance 

This optional variable-length code is 
present only for Y intra-coded blocks, and 
specifies the number of bits in the following 
dct_dc_differential. The values are shown in 
Table 13.33. 

Dct_dc_differential 

If dct_dc_size_luminance ≠ “0,” this optional 
variable-length code is present. The values are 
shown in Table 13.35. 
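
The mapping in Table 13.35 from the additional code to a signed differential can be sketched as follows. This is an illustrative helper, not text from the standard: the function name is hypothetical, and it assumes the size has already been decoded from dct_dc_size_luminance or dct_dc_size_chrominance and that the additional code has been read as an unsigned integer of that many bits.

```python
def dc_differential(size: int, code: int) -> int:
    """Map (size, additional code) to the signed dct_dc_differential value."""
    if size == 0:
        return 0  # no additional code is transmitted
    if not 0 <= code < (1 << size):
        raise ValueError("code does not fit in 'size' bits")
    half = 1 << (size - 1)
    if code >= half:
        return code  # leading "1": positive value, used as-is
    return code - ((1 << size) - 1)  # leading "0": negative value


# Size 3 covers -7..-4 (codes 000..011) and 4..7 (codes 100..111).
print(dc_differential(3, 0b011), dc_differential(3, 0b100))
```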

Dct_dc_size_chrominance 

This optional variable-length code is 
present only for CbCr intra-coded blocks, and 
specifies the number of bits in the following 
dct_dc_differential. The values are shown in 
Table 13.34. 

Dct_dc_differential 

If dct_dc_size_chrominance ≠ “0,” this 
optional variable-length code is present. The 
values are shown in Table 13.35. 

Increment                  Increment
Value      Code            Value              Code
1          1               17                 0000 0101 10
2          011             18                 0000 0101 01
3          010             19                 0000 0101 00
4          0011            20                 0000 0100 11
5          0010            21                 0000 0100 10
6          0001 1          22                 0000 0100 011
7          0001 0          23                 0000 0100 010
8          0000 111        24                 0000 0100 001
9          0000 110        25                 0000 0100 000
10         0000 1011       26                 0000 0011 111
11         0000 1010       27                 0000 0011 110
12         0000 1001       28                 0000 0011 101
13         0000 1000       29                 0000 0011 100
14         0000 0111       30                 0000 0011 011
15         0000 0110       31                 0000 0011 010
16         0000 0101 11    32                 0000 0011 001
                           33                 0000 0011 000
                           macroblock_escape  0000 0001 000

Table 13.26. MPEG-2 Variable-Length Code Table for macroblock_address_increment. 

                                                                                     Spatial    Permitted
                                                                                     Temporal   Spatial
                                 Macroblock  Motion   Motion    Coded    Intra       Weight     Temporal
Type                             Quant       Forward  Backward  Pattern  Macroblock  Code Flag  Weight Class  Code
intra                            0           0        0         0        1           0          0             1
intra, quant                     1           0        0         0        1           0          0             01

I Pictures with Spatial Scalability
coded, compatible                0           0        0         1        0           0          4             1
coded, compatible, quant         1           0        0         1        0           0          4             01
intra                            0           0        0         0        1           0          0             0011
intra, quant                     1           0        0         0        1           0          0             0010
not coded, compatible            0           0        0         0        0           0          4             0001

I Pictures with SNR Scalability
coded                            0           0        0         1        0           0          0             1
coded, quant                     1           0        0         1        0           0          0             01
not coded                        0           0        0         0        0           0          0             001

Table 13.27. MPEG-2 Variable-Length Code Table for I Picture macroblock_type. 

                                                                                     Spatial    Permitted
                                                                                     Temporal   Spatial
                                 Macroblock  Motion   Motion    Coded    Intra       Weight     Temporal
Type                             Quant       Forward  Backward  Pattern  Macroblock  Code Flag  Weight Class  Code
mc, coded                        0           1        0         1        0           0          0             1
no mc, coded                     0           0        0         1        0           0          0             01
mc, not coded                    0           1        0         0        0           0          0             001
intra                            0           0        0         0        1           0          0             0001 1
mc, coded, quant                 1           1        0         1        0           0          0             0001 0
no mc, coded, quant              1           0        0         1        0           0          0             0000 1
intra, quant                     1           0        0         0        1           0          0             0000 01

P Pictures with SNR Scalability
coded                            0           0        0         1        0           0          0             1
coded, quant                     1           0        0         1        0           0          0             01
not coded                        0           0        0         0        0           0          0             001

Table 13.28a. MPEG-2 Variable-Length Code Table for P Picture macroblock_type. 

                                                                                     Spatial    Permitted
                                                                                     Temporal   Spatial
                                 Macroblock  Motion   Motion    Coded    Intra       Weight     Temporal
Type                             Quant       Forward  Backward  Pattern  Macroblock  Code Flag  Weight Class  Code

P Pictures with Spatial Scalability
mc, coded                        0           1        0         1        0           0          0             10
mc, coded, compatible            0           1        0         1        0           1          1,2,3         011
no mc, coded                     0           0        0         1        0           0          0             0000 100
no mc, coded, compatible         0           0        0         1        0           1          1,2,3         0001 11
mc, not coded                    0           1        0         0        0           0          0             0010
intra                            0           0        0         0        1           0          0             0000 111
mc, not coded, compatible        0           1        0         0        0           1          1,2,3         0011
mc, coded, quant                 1           1        0         1        0           0          0             010
no mc, coded, quant              1           0        0         1        0           0          0             0001 00
intra, quant                     1           0        0         0        1           0          0             0000 110
mc, coded, compatible, quant     1           1        0         1        0           1          1,2,3         11
no mc, coded, compatible, quant  1           0        0         1        0           1          1,2,3         0001 01
no mc, not coded, compatible     0           0        0         0        0           1          1,2,3         0001 10
coded, compatible                0           0        0         1        0           0          4             0000 101
coded, compatible, quant         1           0        0         1        0           0          4             0000 010
not coded, compatible            0           0        0         0        0           0          4             0000 0011

Table 13.28b. MPEG-2 Variable-Length Code Table for P Picture macroblock_type. 








                                                                                     Spatial    Permitted
                                                                                     Temporal   Spatial
                                 Macroblock  Motion   Motion    Coded    Intra       Weight     Temporal
Type                             Quant       Forward  Backward  Pattern  Macroblock  Code Flag  Weight Class  Code
interp, not coded                0           1        1         0        0           0          0             10
interp, coded                    0           1        1         1        0           0          0             11
bwd, not coded                   0           0        1         0        0           0          0             010
bwd, coded                       0           0        1         1        0           0          0             011
fwd, not coded                   0           1        0         0        0           0          0             0010
fwd, coded                       0           1        0         1        0           0          0             0011
intra                            0           0        0         0        1           0          0             0001 1
interp, coded, quant             1           1        1         1        0           0          0             0001 0
fwd, coded, quant                1           1        0         1        0           0          0             0000 11
bwd, coded, quant                1           0        1         1        0           0          0             0000 10
intra, quant                     1           0        0         0        1           0          0             0000 01

B Pictures with Spatial Scalability
interp, not coded                0           1        1         0        0           0          0             10
interp, coded                    0           1        1         1        0           0          0             11
bwd, not coded                   0           0        1         0        0           0          0             010
bwd, coded                       0           0        1         1        0           0          0             011
fwd, not coded                   0           1        0         0        0           0          0             0010
fwd, coded                       0           1        0         1        0           0          0             0011
bwd, not coded, compatible       0           0        1         0        0           1          1,2,3         0001 10
bwd, coded, compatible           0           0        1         1        0           1          1,2,3         0001 11
fwd, not coded, compatible       0           1        0         0        0           1          1,2,3         0001 00
fwd, coded, compatible           0           1        0         1        0           1          1,2,3         0001 01
intra                            0           0        0         0        1           0          0             0000 110

Table 13.29a. MPEG-2 Variable-Length Code Table for B Picture macroblock_type. 

                                                                                     Spatial    Permitted
                                                                                     Temporal   Spatial
                                 Macroblock  Motion   Motion    Coded    Intra       Weight     Temporal
Type                             Quant       Forward  Backward  Pattern  Macroblock  Code Flag  Weight Class  Code

B Pictures with Spatial Scalability (continued)
interp, coded, quant             1           1        1         1        0           0          0             0000 111
fwd, coded, quant                1           1        0         1        0           0          0             0000 100
bwd, coded, quant                1           0        1         1        0           0          0             0000 101
intra, quant                     1           0        0         0        1           0          0             0000 0100
fwd, coded, compatible, quant    1           1        0         1        0           1          1,2,3         0000 0101
bwd, coded, compatible, quant    1           0        1         1        0           1          1,2,3         0000 0110 0
not coded, compatible            0           0        0         0        0           0          4             0000 0111 0
coded, quant, compatible         1           0        0         1        0           0          4             0000 0110 1
coded, compatible                0           0        0         1        0           0          4             0000 0111 1

B Pictures with SNR Scalability
coded                            0           0        0         1        0           0          0             1
coded, quant                     1           0        0         1        0           0          0             01
not coded                        0           0        0         0        0           0          0             001

Table 13.29b. MPEG-2 Variable-Length Code Table for B Picture macroblock_type. 

Spatial Temporal  Prediction  Motion Vector  Motion Vector
Weight Class      Type        Count          Format         Code
                  reserved                                  00
0, 1              field       2              field          01
2, 3              field       1              field          01
0, 1, 2, 3        frame       1              frame          10
0, 2, 3           dual prime  1              field          11

Table 13.30. MPEG-2 frame_motion_type Codewords. 

Spatial Temporal  Prediction  Motion Vector  Motion Vector
Weight Class      Type        Count          Format         Code
                  reserved                                  00
0, 1              field       1              field          01
0, 1              16 x 8 mc   2              field          10
0                 dual prime  1              field          11

Table 13.31. MPEG-2 field_motion_type Codewords. 

DCT DC Size              DCT DC Size
Luminance    Code        Luminance    Code
0            100         6            1111 0
1            00          7            1111 10
2            01          8            1111 110
3            101         9            1111 1110
4            110         10           1111 1111 0
5            1110        11           1111 1111 1

Table 13.33. MPEG-2 Variable-Length Code Table for dct_dc_size_luminance. 



DCT DC Size              DCT DC Size
Chrominance  Code        Chrominance  Code
0            00          6            1111 10
1            01          7            1111 110
2            10          8            1111 1110
3            110         9            1111 1111 0
4            1110        10           1111 1111 10
5            1111 0      11           1111 1111 11


Table 13.34. MPEG-2 Variable-Length Code Table for dct_dc_size_chrominance. 

DCT DC          DCT DC  Code       Code
Differential    Size    (Y)        (CbCr)      Additional Code
-2048 to -1024  11      111111111  1111111111  00000000000 to 01111111111
-1023 to -512   10      111111110  1111111110  0000000000 to 0111111111
-511 to -256    9       11111110   111111110   000000000 to 011111111
-255 to -128    8       1111110    11111110    00000000 to 01111111
-127 to -64     7       111110     1111110     0000000 to 0111111
-63 to -32      6       11110      111110      000000 to 011111
-31 to -16      5       1110       11110       00000 to 01111
-15 to -8       4       110        1110        0000 to 0111
-7 to -4        3       101        110         000 to 011
-3 to -2        2       01         10          00 to 01
-1              1       00         01          0
0               0       100        00
1               1       00         01          1
2 to 3          2       01         10          10 to 11
4 to 7          3       101        110         100 to 111
8 to 15         4       110        1110        1000 to 1111
16 to 31        5       1110       11110       10000 to 11111
32 to 63        6       11110      111110      100000 to 111111
64 to 127       7       111110     1111110     1000000 to 1111111
128 to 255      8       1111110    11111110    10000000 to 11111111
256 to 511      9       11111110   111111110   100000000 to 111111111
512 to 1023     10      111111110  1111111110  1000000000 to 1111111111
1024 to 2047    11      111111111  1111111111  10000000000 to 11111111111


Table 13.35. MPEG-2 Variable-Length Code Table for dct_dc_differential. 

Run   Level   Code
EOB           10
0     1       1s (if first coefficient)
0     1       11s (not first coefficient)
0     2       0100 s
0     3       0010 1s
0     4       0000 110s
0     5       0010 0110 s
0     6       0010 0001 s
0     7       0000 0010 10s
0     8       0000 0001 1101 s
0     9       0000 0001 1000 s
0     10      0000 0001 0011 s
0     11      0000 0001 0000 s
0     12      0000 0000 1101 0s
0     13      0000 0000 1100 1s
0     14      0000 0000 1100 0s
0     15      0000 0000 1011 1s
0     16      0000 0000 0111 11s
0     17      0000 0000 0111 10s
0     18      0000 0000 0111 01s
0     19      0000 0000 0111 00s
0     20      0000 0000 0110 11s
0     21      0000 0000 0110 10s
0     22      0000 0000 0110 01s
0     23      0000 0000 0110 00s
0     24      0000 0000 0101 11s
0     25      0000 0000 0101 10s
0     26      0000 0000 0101 01s
0     27      0000 0000 0101 00s
0     28      0000 0000 0100 11s
0     29      0000 0000 0100 10s
0     30      0000 0000 0100 01s

Note: 
1. s = sign of level; “0” for positive; “1” for negative. 

Table 13.36a. MPEG-2 Variable-Length Code Table Zero for 
dct_coefficient_first and dct_coefficient_next. 

Run   Level   Code
0     31      0000 0000 0100 00s
0     32      0000 0000 0011 000s
0     33      0000 0000 0010 111s
0     34      0000 0000 0010 110s
0     35      0000 0000 0010 101s
0     36      0000 0000 0010 100s
0     37      0000 0000 0010 011s
0     38      0000 0000 0010 010s
0     39      0000 0000 0010 001s
0     40      0000 0000 0010 000s
1     1       011s
1     2       0001 10s
1     3       0010 0101 s
1     4       0000 0011 00s
1     5       0000 0001 1011 s
1     6       0000 0000 1011 0s
1     7       0000 0000 1010 1s
1     8       0000 0000 0011 111s
1     9       0000 0000 0011 110s
1     10      0000 0000 0011 101s
1     11      0000 0000 0011 100s
1     12      0000 0000 0011 011s
1     13      0000 0000 0011 010s
1     14      0000 0000 0011 001s
1     15      0000 0000 0001 0011s
1     16      0000 0000 0001 0010s
1     17      0000 0000 0001 0001s
1     18      0000 0000 0001 0000s
2     1       0101 s
2     2       0000 100s
2     3       0000 0010 11s
2     4       0000 0001 0100 s

Note: 
1. s = sign of level; “0” for positive; “1” for negative. 

Table 13.36b. MPEG-2 Variable-Length Code Table Zero for 
dct_coefficient_first and dct_coefficient_next. 

Run   Level   Code
16    1       0000 0010 00s
16    2       0000 0000 0001 0101s
17    1       0000 0001 1111 s
18    1       0000 0001 1010 s
19    1       0000 0001 1001 s
20    1       0000 0001 0111 s
21    1       0000 0001 0110 s
22    1       0000 0000 1111 1s
23    1       0000 0000 1111 0s
24    1       0000 0000 1110 1s
25    1       0000 0000 1110 0s
26    1       0000 0000 1101 1s
27    1       0000 0000 0001 1111s
28    1       0000 0000 0001 1110s
29    1       0000 0000 0001 1101s
30    1       0000 0000 0001 1100s
31    1       0000 0000 0001 1011s
ESC           0000 01

Note: 
1. s = sign of level; “0” for positive; “1” for negative. 

Table 13.36d. MPEG-2 Variable-Length Code Table Zero for 
dct_coefficient_first and dct_coefficient_next. 

Run   Level   Code
EOB           0110
0     1       10s
0     2       110s
0     3       0111 s
0     4       1110 0s
0     5       1110 1s
0     6       0001 01s
0     7       0001 00s
0     8       1111 011s
0     9       1111 100s
0     10      0010 0011 s
0     11      0010 0010 s
0     12      1111 1010 s
0     13      1111 1011 s
0     14      1111 1110 s

Note: 
1. s = sign of level; “0” for positive; “1” for negative. 

Table 13.37a. MPEG-2 Variable-Length Code Table One for 
dct_coefficient_first and dct_coefficient_next. 

Run   Level   Code
8     1       0000 101s
8     2       0000 0001 0001 s
9     1       1111 000s
9     2       0000 0000 1000 1s
10    1       1111 010s
10    2       0000 0000 1000 0s
11    1       0010 0001 s
11    2       0000 0000 0001 1010s
12    1       0010 0101 s
12    2       0000 0000 0001 1001s
13    1       0010 0100 s
13    2       0000 0000 0001 1000s
14    1       0000 0010 1s
14    2       0000 0000 0001 0111s
15    1       0000 0011 1s
15    2       0000 0000 0001 0110s
16    1       0000 0011 01s
16    2       0000 0000 0001 0101s
17    1       0000 0001 1111 s
18    1       0000 0001 1010 s
19    1       0000 0001 1001 s
20    1       0000 0001 0111 s
21    1       0000 0001 0110 s
22    1       0000 0000 1111 1s
23    1       0000 0000 1111 0s
24    1       0000 0000 1110 1s
25    1       0000 0000 1110 0s
26    1       0000 0000 1101 1s
27    1       0000 0000 0001 1111s
28    1       0000 0000 0001 1110s
29    1       0000 0000 0001 1101s
30    1       0000 0000 0001 1100s
31    1       0000 0000 0001 1011s
ESC           0000 01

Note: 
1. s = sign of level; “0” for positive; “1” for negative. 

Table 13.37d. MPEG-2 Variable-Length Code Table One for 
dct_coefficient_first and dct_coefficient_next. 

Run Length  Code
0           0000 00
1           0000 01
2           0000 10
:           :
62          1111 10
63          1111 11

Table 13.38. Run Encoding Following an Escape Code 
for dct_coefficient_first and dct_coefficient_next. 

Level      Code
-2047      1000 0000 0001
-2046      1000 0000 0010
:          :
-1         1111 1111 1111
forbidden  0000 0000 0000
1          0000 0000 0001
:          :
2047       0111 1111 1111

Table 13.39. Level Encoding Following an Escape Code 
for dct_coefficient_first and dct_coefficient_next. 
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Dct_coefficient_first 

This optional variable-length codeword is 
used for the first DCT coefficient in non-intra-coded 
blocks, and is defined in Tables 13.36, 
13.37, 13.38, and 13.39. 

Dct_coefficient_next 

Up to 63 optional variable-length codewords, 
present only for I, P, and B frames. 
They are the DCT coefficients after the first 
one, and are defined in Tables 13.36, 13.37, 
13.38, and 13.39. 
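
Tables 13.38 and 13.39 describe the fixed-length fields that follow an escape code: a 6-bit run and a 12-bit level. The sketch below decodes those 18 bits. The function name and the bit-string input are assumptions made for illustration, and reading the 12-bit level as two's complement is inferred from the table entries (-1 = 1111 1111 1111, 2047 = 0111 1111 1111).

```python
def decode_escape(bits: str) -> tuple[int, int]:
    """Decode the 18 bits after ESC into (run, level)."""
    if len(bits) != 18 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly 18 characters of '0'/'1'")
    run = int(bits[:6], 2)   # Table 13.38: plain 6-bit binary, 0..63
    raw = int(bits[6:], 2)   # Table 13.39: 12-bit level field
    if raw == 0:
        raise ValueError("level code 0000 0000 0000 is forbidden")
    if raw == 0x800:
        # Assumption: treated as invalid because the table starts at -2047.
        raise ValueError("level code 1000 0000 0000 is not listed")
    level = raw - 0x1000 if raw & 0x800 else raw
    return run, level


print(decode_escape("000010" + "111111111111"))  # run 2, level -1
```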

End_of_block 

This 2-bit or 4-bit value is used to indicate 
that no additional non-zero coefficients are 
present. The value of this parameter is “10” or 
“0110.” 



Motion Compensation 

Figure 13.14 illustrates the basic motion com- 
pensation process. Motion compensation 
forms predictions from previously decoded 
pictures, which are in turn combined with the 
coefficient data (error terms) from the IDCT. 

Field Prediction 

Prediction for P pictures is made from the two 
most recently decoded reference fields. The 
simplest case is shown in Figure 13.15, used 
when predicting the first picture of a frame or 
when using field prediction within a frame. 



Predicting the second field of a frame also 
requires the two most recently decoded refer- 
ence fields. This is shown in Figure 13.16 
where the second picture is the bottom field 
and in Figure 13.17 where the second picture 
is the top field. 

Field prediction for B pictures is made 
from the two fields of the two most recent ref- 
erence frames, as shown in Figure 13.18. 

Frame Prediction 

Prediction for P pictures is made from the 
most recently decoded picture, as shown in 
Figure 13.19. The reference picture may have 
been coded as either two fields or a single 
frame. 

Frame prediction for B pictures is made 
from the two most recent reference frames, as 
shown in Figure 13.20. Each reference frame 
may have been coded as either two fields or a 
single frame. 

Figure 13.14. Simplified Motion Compensation Process. 

Figure 13.15. P Picture Prediction of First Field or Field Prediction in a Frame Picture. 
[Figure labels: top reference field; bottom reference field; possible intervening B pictures not yet decoded.] 

Figure 13.16. P Picture Prediction of Second Field Picture (Bottom Field). 
[Figure labels: top reference field; bottom reference field; possible intervening B pictures not yet decoded.] 

Figure 13.17. P Picture Prediction of Second Field Picture (Top Field). 
[Figure labels: top reference field; bottom reference field; possible intervening B pictures not yet decoded.] 

Figure 13.18. Field Prediction of B Field or Frame Pictures. 
[Figure labels: possible intervening B pictures already decoded; possible intervening B pictures not yet decoded.] 

Figure 13.19. Frame Prediction for P Pictures. 
[Figure label: possible intervening B pictures not yet decoded.] 

Figure 13.20. Frame Prediction for B Pictures. 
[Figure labels: possible intervening B pictures already decoded; possible intervening B pictures not yet decoded.] 
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PES Packet 

A packetized elementary stream (PES) 
consists of a single elementary stream (ES) 
which has been made into packets, each start- 
ing with an added packet header. A PES con- 
tains only one type of data (audio, video, etc.) 
from one source. 

The general format of the PES packet is 
shown in Figure 13.21. Note that start codes 
(0x000001xx) must be byte aligned by inserting 
0-7 “0” bits before the start code. 

Packet_start_code_prefix 

This 24-bit field has a value of 0x000001 
and in conjunction with stream_ID, indicates 
the beginning of a packet. 

Stream_ID 

This 8-bit code specifies the type and num- 
ber of elementary streams, as shown in Table 
13.40. For the ATSC and OpenCable™ stan- 
dards, the value for audio streams must be 
“1011 1101” to indicate Dolby® Digital. 

PES_packet_length 

This 16-bit binary number specifies the 
number of bytes in the PES packet following 
this field. A value of zero indicates it is neither 
specified nor bounded, and is used only in 
transport streams. For the ATSC standard, the 
value must be 0x0000 for video streams. 
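
The three fields described so far occupy the first six bytes of every PES packet. The following is a minimal sketch of reading them; the function name is hypothetical, and it assumes the input already starts at a byte-aligned start code and that the 16-bit length is stored most significant byte first.

```python
def parse_pes_start(data: bytes) -> tuple[int, int]:
    """Return (stream_ID, PES_packet_length) from the first 6 bytes."""
    if len(data) < 6:
        raise ValueError("need at least 6 bytes")
    if data[0:3] != b"\x00\x00\x01":
        raise ValueError("missing packet_start_code_prefix 0x000001")
    stream_id = data[3]
    # A length of 0 means "neither specified nor bounded".
    length = int.from_bytes(data[4:6], "big")
    return stream_id, length


# Private stream 1 (stream_ID 1011 1101) with 258 bytes following the field.
print(parse_pes_start(bytes([0x00, 0x00, 0x01, 0xBD, 0x01, 0x02])))
```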



Note: The following fields (until the next note) are not 
present if stream_ID = program stream map, padding 
stream, private stream 2, ECM stream, EMM stream, 
DSM-CC stream, H.222.1 type E, or program stream 
directory. 

Marker_bits 

These optional 2 bits have a value of “10.” 

PES_scrambling_control 

This optional 2-bit code specifies the 
scrambling mode. “00” = not scrambled, “01” = 
reserved, “10” = scrambled with even key, “11” 
= scrambled with odd key. For the SVCD, 
ATSC, and OpenCable™ standards, the value 
must be “00.” 

PES_priority 

This optional bit specifies the priority of 
the payload of the PES packet. A “1” has a 
higher priority than a “0.” For the DVB stan- 
dard, this field is optional, and may be ignored 
by the decoder if present. For the SVCD stan- 
dard, this value must be “0.” 

Data_alignment_indicator 

A “1” for this optional bit indicates that the 
PES packet header is immediately followed by 
the video start code or audio syncword specified 
by the Data Stream Alignment Descriptor 
(if present). For the SVCD standard, this value 
must be “0.” 

[Figure 13.21 field order: packet start code prefix, stream ID, PES packet length; 
PES scrambling control, PES priority, data alignment indicator, copyright, original or copy; 
PTS DTS flags, ESCR flag, ES rate flag, DSM trick mode flag, additional copy info flag, 
PES CRC flag, PES extension flag, PES header data length; 
PTS and DTS fields, ESCR fields, ES rate fields, DSM trick mode fields, 
additional copy info fields, PES CRC fields, PES extension fields; 
stuffing bytes; data bytes 1 to N. 
Alternative payloads: data bytes 1 to N only, or padding bytes 1 to N.] 

Figure 13.21. MPEG-2 PES Packet Structure. Marker and reserved bits not shown. 

Stream                                               Code
all audio streams                                    1011 1000
all video streams                                    1011 1001
program stream map                                   1011 1100
private stream 1 (includes LPCM, Dolby Digital,
  Dolby Digital Plus, DTS, DTS-HD and MLP audio)     1011 1101
padding stream                                       1011 1110
private stream 2                                     1011 1111
MPEG-1.3, -2.3, -4.3 or MPEG-2.7 audio stream        110x xxxx
MPEG-1.2, -2.2, -4.2 or MPEG-4.10 (H.264)
  video stream                                       1110 xxxx
ECM stream                                           1111 0000
EMM stream                                           1111 0001
DSM-CC stream                                        1111 0010
ISO/IEC 13522 stream                                 1111 0011
ITU-T H.222.1 type A                                 1111 0100
ITU-T H.222.1 type B                                 1111 0101
ITU-T H.222.1 type C                                 1111 0110
ITU-T H.222.1 type D                                 1111 0111
ITU-T H.222.1 type E                                 1111 1000
ancillary_stream                                     1111 1001
MPEG-4 SL-packetized stream                          1111 1010
MPEG-4 FlexMux stream                                1111 1011
metadata stream                                      1111 1100
extended_stream_ID                                   1111 1101
reserved                                             1111 1110
program stream directory                             1111 1111


Table 13.40. MPEG-2 stream_ID 
Codewords. 

Stream                                               Code
IPMP control information stream                      000 0000
IPMP stream                                          000 0001
reserved_data_stream                                 000 0010
:                                                    :
reserved_data_stream                                 011 1111
private stream                                       100 0000
:                                                    :
private stream                                       101 0100
SMPTE 421M (VC-1) video stream                       101 0101
Dolby Digital, Dolby Digital Plus,
  DTS or DTS-HD core audio stream                    111 0001
MLP or DTS-HD extension audio stream                 111 0010


Table 13.41. Some Common MPEG-2 
stream_ID_extension Codewords. 



Copyright 

A “1” for this bit indicates that the material 
is copyrighted. For the SVCD standard, this 
value must be “0.” 

Original_or_copy 

A “1” for this bit indicates that the material 
is original. A “0” indicates it is a copy. For the 
SVCD standard, this value must be “1.” 

PTS_DTS_flags 

A value of “10” for these two bits indicates 
a PTS (presentation time stamp) field is 
present in the PES packet header. A value of 
“11” indicates both PTS and DTS (decoding 
time stamp) fields are present. A value of “00” 
indicates neither PTS nor DTS fields are 
present. For the SVCD standard, this value 
must be “00,” “10,” or “11” (“11” only for 
video). 

ESCR_flag 

A “1” for this bit indicates the ESCR (ele- 
mentary stream clock reference) base and 
extension fields are present in the PES packet 
header. For the SVCD, ATSC, and OpenCa- 
ble standards, the value must be “0.” For the 
DVB standard, the ESCR fields are optional, 
and may be ignored by the decoder if present. 

ES_rate_flag 

A “1” for this bit indicates ES_rate (ele- 
mentary stream rate) is present in the PES 
packet header. For the SVCD, ATSC, and 
OpenCable standards, the value must be “0.” 
For the DVB standard, the ES fields are 
optional, and may be ignored by the decoder if 
present. 

DSM_trick_mode_flag 

A “1” for this bit indicates that the 
trick_mode_control field is present. For the 
SVCD standard, this value must be “0.” 

Additional_copy_info_flag 

A “1” for this bit indicates that the 
additional_copy_info field is present. For the 
SVCD standard, this value must be “0.” 

PES_CRC_flag 

A “1” for this bit indicates that the 
previous_PES_packet_CRC field is present. For 
the SVCD, ATSC, and OpenCable™ standards, 
the value must be “0.” 

PES_extension_flag 

A “1” for this bit indicates that an extension 
field is present in this PES packet header. 
When conveying SMPTE 421M (VC-1) video 
streams, this bit must be a “1” to enable the 
insertion of extensions in the PES packet 
header. 



PES_header_data_length 

This 8-bit binary number specifies the 
number of bytes for optional fields and stuffing 
in this PES packet header. 



Marker_bits 

These optional 4 bits have a value of 
“0010.” This field is present only if 
PTS_DTS_flags = “10.” 

PTS [32-30] 

This optional field is present only if 
PTS_DTS_flags = “10.” 

Marker_bit 

This optional bit has a value of “1.” This 
field is present only if PTS_DTS_flags = “10.” 

PTS [29-15] 

This optional field is present only if 
PTS_DTS_flags = “10.” 

Marker_bit 

This optional bit has a value of “1.” This 
field is present only if PTS_DTS_flags = “10.” 

PTS [14-0] 

The optional 33-bit presentation time 
stamp (PTS) indicates the intended time of display 
by the decoder. It is specified in periods of 
the 27 MHz clock divided by 300. This field is 
present only if PTS_DTS_flags = “10.” 

Marker_bit 

This optional bit has a value of “1.” This 
field is present only if PTS_DTS_flags = “10.” 
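
The seven fields above pack a 33-bit time stamp and its marker bits into five bytes. A hedged sketch of reassembling the value follows; the function name and the `prefix` parameter are illustrative, and the byte layout assumes the fields are transmitted in the order listed, most significant bit first. Because the time stamp counts periods of the 27 MHz clock divided by 300, one unit is 1/90,000 of a second.

```python
def decode_timestamp(field: bytes, prefix: int = 0b0010) -> int:
    """Reassemble a 33-bit PTS or DTS from its 5-byte encoding."""
    if len(field) != 5:
        raise ValueError("a time stamp field is exactly 5 bytes")
    if field[0] >> 4 != prefix:
        raise ValueError("unexpected 4-bit marker prefix")
    if not (field[0] & 1 and field[2] & 1 and field[4] & 1):
        raise ValueError("a marker_bit is not 1")
    return (
        ((field[0] >> 1) & 0x07) << 30   # bits 32-30
        | field[1] << 22                 # bits 29-22
        | (field[2] >> 1) << 15          # bits 21-15
        | field[3] << 7                  # bits 14-7
        | field[4] >> 1                  # bits 6-0
    )


ticks = decode_timestamp(bytes([0x21, 0x00, 0x05, 0xBF, 0x21]))
print(ticks, ticks / 90_000)  # 90 kHz units, then seconds
```

Passing a different `prefix` (for example `0b0011` or `0b0001`) lets the same helper check the marker values used when both PTS and DTS are present.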
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Marker_bits 

These optional 4 bits have a value of 
“0011.” This field is present only if 
PTS_DTS_flags = “11.” 

PTS [32-30] 

This optional field is present only if 
PTS_DTS_flags = “11.” 

Marker_bit 

This optional bit has a value of “1.” This 
field is present only if PTS_DTS_flags = “11.” 

PTS [29-15] 

This optional field is present only if 
PTS_DTS_flags = “11.” 

Marker_bit 

This optional bit has a value of “1.” This 
field is present only if PTS_DTS_flags = “11.” 

PTS [14-0] 

This optional field is present only if 
PTS_DTS_flags = “11.” 

Marker_bit 

This optional bit has a value of “1.” This 
field is present only if PTS_DTS_flags = “11.” 

Marker_bits 

These optional 4 bits have a value of 
“0001.” This field is present only if 
PTS_DTS_flags = “11.” 

DTS [32-30] 

This optional field is present only if 
PTS_DTS_flags = “11.” 

Marker_bit 

This optional bit has a value of “1.” This 
field is present only if PTS_DTS_flags = “11.” 

DTS [29-15] 

This optional field is present only if 
PTS_DTS_flags = “11.” 

Marker_bit 

This optional bit has a value of “1.” This 
field is present only if PTS_DTS_flags = “11.” 

DTS [14-0] 

The optional 33-bit decoding time stamp 
(DTS) indicates the intended time of decoding. 
It is specified in periods of the 27 MHz clock 
divided by 300. This field is present only if 
PTS_DTS_flags = “11.” 

Marker_bit 

This optional bit has a value of “1.” This 
field is present only if PTS_DTS_flags = “11.” 



Reserved_bits 

These optional 2 bits have a value of “11.”
This field is present only if ESCR_flag = “1.”

ESCR_base [32-30]

This optional field is present only if
ESCR_flag = “1.”

Marker_bit

This optional bit has a value of “1.” This
field is present only if ESCR_flag = “1.”

ESCR_base [29-15]

This optional field is present only if
ESCR_flag = “1.”

Marker_bit

This optional bit has a value of “1.” This
field is present only if ESCR_flag = “1.”

ESCR_base [14-0]

This optional field is present only if
ESCR_flag = “1.”

Marker_bit

This optional bit has a value of “1.” This
field is present only if ESCR_flag = “1.”

ESCR_extension

The optional 9-bit elementary stream clock
reference (ESCR) extension and the 33-bit
ESCR base are combined into a 42-bit value. It
indicates the intended time of arrival of the
byte containing the last bit of ESCR_base. The
value of ESCR_base specifies the number of 90
kHz clock periods.

The value of ESCR_extension specifies the
number of 27 MHz clock periods after the 90
kHz period starts.

This field is present only if ESCR_flag =
“1.”

Marker_bit

This optional bit has a value of “1.” This
field is present only if ESCR_flag = “1.”
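
Since ESCR_base counts 90 kHz periods and ESCR_extension counts the 27 MHz periods within one 90 kHz period, the combined time in 27 MHz ticks is base × 300 + extension. The sketch below assumes the 6-byte layout listed above (2 reserved bits, three base segments each followed by a marker bit, the 9-bit extension, and a final marker bit). The function names are hypothetical.

```python
def decode_escr(field: bytes):
    """Return (ESCR_base, ESCR_extension) from the 6-byte ESCR field."""
    if len(field) != 6:
        raise ValueError("ESCR field must be exactly 6 bytes")
    v = int.from_bytes(field, "big")  # 48 bits, MSB first
    base = (((v >> 43) & 0x7) << 30      # ESCR_base [32-30]
            | ((v >> 27) & 0x7FFF) << 15  # ESCR_base [29-15]
            | ((v >> 11) & 0x7FFF))       # ESCR_base [14-0]
    extension = (v >> 1) & 0x1FF          # 9-bit ESCR_extension
    return base, extension


def escr_ticks_27mhz(base: int, extension: int) -> int:
    """Combine base (90 kHz) and extension (27 MHz) into 27 MHz ticks."""
    return base * 300 + extension


def escr_seconds(base: int, extension: int) -> float:
    return escr_ticks_27mhz(base, extension) / 27_000_000
```

For example, a base of 90000 with an extension of 0 corresponds to 27,000,000 ticks, or one second.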



Marker_bit 

This optional bit has a value of “1.” This
field is present only if ES_rate_flag = “1.”

ES_rate

This optional 22-bit elementary stream rate
(ES_rate) indicates the rate at which the decoder
receives bytes of the PES packet. It is specified
in units of 50 bytes per second. This field is
present only if ES_rate_flag = “1.”

Marker_bit

This optional bit has a value of “1.” This
field is present only if ES_rate_flag = “1.”



Trick_mode_control 

This optional 3-bit codeword indicates
which trick mode is applied to the video
stream, as shown in Table 13.42. This field is
present only if DSM_trick_mode_flag = “1.”

Trick Mode      Code
fast forward    000
slow forward    001
freeze frame    010
fast reverse    011
slow reverse    100
reserved        101
reserved        110
reserved        111

Table 13.42. MPEG-2 trick_mode_control
Codewords.

Field_ID 

This optional 2-bit codeword indicates
which fields are to be displayed, as shown in
Table 13.43. This field is present only if
DSM_trick_mode_flag = “1” and
trick_mode_control = “000” or “011.”

Field ID            Code
top field only      00
bottom field only   01
complete frame      10
reserved            11

Table 13.43. MPEG-2 field_ID Codewords.

Intra_slice_refresh 

A “1” for this optional bit indicates that
there may be missing macroblocks between
slices. This field is present only if
DSM_trick_mode_flag = “1” and
trick_mode_control = “000” or “011.”

Frequency_truncation

This optional 2-bit codeword indicates that
a restricted set of coefficients may have been
used in coding the data, as shown in Table
13.44. This field is present only if
DSM_trick_mode_flag = “1” and
trick_mode_control = “000” or “011.”

Description                         Code
only DC coefficients are non-zero   00
first 3 coefficients are non-zero   01
first 6 coefficients are non-zero   10
all coefficients may be non-zero    11

Table 13.44. MPEG-2 frequency_truncation
Codewords.



Rep_cntrl 

This optional 5-bit binary number indicates
the number of times each interlaced field or
progressive frame should be displayed. A value
of “00000” is not allowed. This field is present
only if DSM_trick_mode_flag = “1” and
trick_mode_control = “001” or “100.”

Field_ID

This optional 2-bit codeword, shown in
Table 13.43, indicates which fields are to be
displayed. This field is present only if
DSM_trick_mode_flag = “1” and
trick_mode_control = “010.”

Reserved_bits

These 3 optional bits have a value of “111.”
This field is present only if
DSM_trick_mode_flag = “1” and
trick_mode_control = “010.”

Reserved_bits

These 5 bits have a value of “1 1111.” This
field is present only if DSM_trick_mode_flag =
“1” and trick_mode_control = “101,” “110,” or
“111.”
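
Taken together, the trick mode fields form a single byte: the 3-bit trick_mode_control codeword followed by five bits whose meaning depends on that codeword. The following Python sketch (a hypothetical helper, not part of any standard API) shows one way the byte could be unpacked according to the rules above.

```python
TRICK_MODES = {
    0b000: "fast forward",
    0b001: "slow forward",
    0b010: "freeze frame",
    0b011: "fast reverse",
    0b100: "slow reverse",
}


def parse_trick_mode(byte: int) -> dict:
    """Unpack the 8-bit DSM trick mode field of a PES packet header."""
    control = (byte >> 5) & 0x7
    info = {"mode": TRICK_MODES.get(control, "reserved")}
    if control in (0b000, 0b011):        # fast forward or fast reverse
        info["field_id"] = (byte >> 3) & 0x3
        info["intra_slice_refresh"] = (byte >> 2) & 0x1
        info["frequency_truncation"] = byte & 0x3
    elif control in (0b001, 0b100):      # slow forward or slow reverse
        info["rep_cntrl"] = byte & 0x1F
    elif control == 0b010:               # freeze frame
        info["field_id"] = (byte >> 3) & 0x3
    # For the reserved codewords the remaining 5 bits carry no information.
    return info
```

For instance, the byte “0001 0011” decodes as fast forward with field_ID “10” (complete frame), intra_slice_refresh “0,” and frequency_truncation “11.”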



Marker_bit 

This optional bit is always a “1.” This field
is present only if additional_copy_info_flag =
“1.”

Additional_copy_info

This optional 7-bit field contains private
data regarding copyright information. This
field is present only if additional_copy_info_flag
= “1.”



Previous_PES_packet_CRC 

These optional 16 bits are present only if
PES_CRC_flag = “1.” For the DVB standard,
this field is optional, and may be ignored by the
decoder if present.



PES_private_data_flag 

A “1” for this optional bit indicates that
private data is present. This field is present only if
PES_extension_flag = “1.” For the SVCD, ATSC,
and OpenCable standards, the value must be
“0.”

Pack_header_field_flag

A “1” for this bit indicates that an MPEG-1
pack header or program stream pack header is
in this PES packet header. This field is present
only if PES_extension_flag = “1.” For the SVCD,
ATSC, and OpenCable standards, the value
must be “0.”

Program_packet_sequence_counter_flag

A “1” for this optional bit indicates that the
program_packet_sequence_counter,
MPEG1_MPEG2_identifier, and
original_stuff_length fields are present. This
field is present only if PES_extension_flag = “1.”
For the SVCD, ATSC, and OpenCable™ standards,
the value must be “0.”

P-STD_buffer_flag 

A “1” for this optional bit indicates that
P-STD_buffer_scale and P-STD_buffer_size are
present. This field is present only if
PES_extension_flag = “1.” For the ATSC and
OpenCable standards, the value must be “0.”
For the SVCD standard, this value must be “1.”

Reserved_bits

These optional 3 bits are always “111.”
This field is present only if PES_extension_flag
= “1.”

PES_extension_flag_2

A “1” for this optional bit indicates that
PES_extension_field_length and associated
fields are present. This field is present only if
PES_extension_flag = “1.” For the SVCD standard,
this value must be “0.” When conveying
SMPTE 421M (VC-1) video streams, this bit
must be a “1” to enable the insertion of the second
group of extensions in the PES packet
header.



PES_private_data 

These optional 128 bits of private data,
combined with the fields before and after, must
not emulate the packet_start_code_prefix. This
field is present only if PES_extension_flag = “1”
and PES_private_data_flag = “1.” For the DVB
standard, this field is optional, and may be
ignored by the decoder if present.

Pack_field_length

This optional 8-bit binary number indicates
the length, in bytes, of an immediately following
pack header. This field, and the immediately
following pack header, are present only if
PES_extension_flag = “1” and
pack_header_field_flag = “1.”



Marker_bit 

This optional bit is always a “1.” This field
is present only if PES_extension_flag = “1” and
program_packet_sequence_counter_flag = “1.”

Program_packet_sequence_counter

This optional 7-bit binary number increments
with each successive PES packet in a
program stream or MPEG-1 system stream. It
wraps around to zero after reaching its maximum
value. No two consecutive PES packets
can have the same values. This field is present
only if PES_extension_flag = “1” and
program_packet_sequence_counter_flag = “1.”
For the DVB standard, this field is optional,
and may be ignored by the decoder if present.

Marker_bit

This optional bit is always a “1.” This field
is present only if PES_extension_flag = “1” and
program_packet_sequence_counter_flag = “1.”

MPEG1_MPEG2_identifier

A “1” for this optional bit indicates the PES
packet has information from an MPEG-1 system
stream. A “0” indicates the PES packet has
information from a program stream. This field
is present only if PES_extension_flag = “1” and
program_packet_sequence_counter_flag = “1.”
For the DVB standard, this field is optional,
and may be ignored by the decoder if present.

Original_stuff_length

This optional 6-bit binary number specifies
the number of stuffing bytes used in the original
PES or MPEG-1 packet header. This field is
present only if PES_extension_flag = “1” and
program_packet_sequence_counter_flag = “1.”
For the DVB standard, this field is optional,
and may be ignored by the decoder if present.



Marker_bits 

These optional two bits are always “01.”
This field is present only if PES_extension_flag
and P-STD_buffer_flag = “1.”

P-STD_buffer_scale

This optional bit indicates the scaling factor
for the following P-STD_buffer_size parameter.
For audio streams, a value of “0” is present.
For video streams, a value of “1” is present. For
all other types of streams, a value of “0” or “1”
may be used. This field is present only if
PES_extension_flag and P-STD_buffer_flag =
“1.” For the DVB standard, this field is
optional, and may be ignored by the decoder if
present.

P-STD_buffer_size

This optional 13-bit binary number specifies
the size of the decoder input buffer. If
P-STD_buffer_scale is a “0,” the unit is 128 bytes.
If P-STD_buffer_scale is a “1,” the unit is 1024
bytes. This field is present only if
PES_extension_flag and P-STD_buffer_flag =
“1.” For the DVB standard, this field is
optional, and may be ignored by the decoder if
present.
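
As a small worked example of the scaling rule, the Python sketch below converts the two fields into a buffer size in bytes. The function name is hypothetical.

```python
def p_std_buffer_bytes(buffer_scale: int, buffer_size: int) -> int:
    """Decoder input buffer size in bytes.

    buffer_scale: the 1-bit P-STD_buffer_scale ("0" = 128-byte units,
                  "1" = 1024-byte units).
    buffer_size:  the 13-bit P-STD_buffer_size value.
    """
    if not 0 <= buffer_size < (1 << 13):
        raise ValueError("P-STD_buffer_size is a 13-bit number")
    unit = 1024 if buffer_scale else 128
    return buffer_size * unit


# An audio stream (scale "0") with a size value of 32 has a 4096-byte buffer;
# a video stream (scale "1") with a size value of 46 has a 47104-byte buffer.
print(p_std_buffer_bytes(0, 32), p_std_buffer_bytes(1, 46))
```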



Marker_bit 

This optional bit is always a “1.” This field
is present only if PES_extension_flag and
PES_extension_flag_2 = “1.”

PES_extension_field_length

This optional 7-bit binary number indicates
the total number of bytes for the next
three fields. This field is present only if
PES_extension_flag and PES_extension_flag_2 =
“1.”

Stream_ID_extension_flag

A “0” for this optional 1-bit flag indicates
that a stream_ID_extension field follows. This
field is present only if PES_extension_flag and
PES_extension_flag_2 = “1.” When conveying
SMPTE 421M (VC-1) video streams, this bit
must be a “0” to enable the insertion of a valid
stream_ID_extension.

Stream_ID_extension

This 7-bit codeword is used as an extension
to stream_ID to specify the elementary
stream type as defined in Table 13.41. This
field is not used unless stream_ID = “1111
1101.” This optional field is present only if
PES_extension_flag and PES_extension_flag_2 =
“1” and stream_ID_extension_flag = “0.”

Reserved_byte

[n] bytes of reserved data with a value of
“1111 1111.” This optional field is present only
if PES_extension_flag and PES_extension_flag_2
= “1” and stream_ID_extension_flag = “0.”

Stuffing_byte

[n] optional bytes with a value of “1111
1111.” Up to 32 stuffing bytes may be used.
They are ignored by the decoder.

PES_packet_data_byte

[n] bytes of data from the audio stream,
video stream, private stream 1, ancillary
stream, H.222.1 types A-D stream, or ISO/IEC
13522 stream. The number of bytes is derived
from the PES_packet_length field.

Note: The following field is present if stream_ID = program
stream map, private stream 2, ECM stream, EMM
stream, DSM-CC stream, H.222.1 type E, or program
stream directory.

PES_packet_data_byte

[n] bytes of data from program stream
map, private stream 2, ECM stream, EMM
stream, DSM-CC stream, H.222.1 type E
stream, or program stream directory descriptors.
The number of bytes is derived from the
PES_packet_length field.

Note: The following field is present if stream_ID = padding
stream.

Padding_byte

[n] bytes that have a value of “1111 1111.”
The number of bytes is specified by
PES_packet_length. It is ignored by the
decoder.



Program Stream 

The program stream, used by the DVD 
and SVCD standards, is designed for use in rel- 
atively error-free environments. It consists of 
one or more PES packets multiplexed together 
and coded with data that allows them to be 
decoded in synchronization. Program stream 
packets may be of variable and relatively great 
length. 

The general format of the program stream
is shown in Figure 13.22. Note that start codes
(0x000001xx) must be byte aligned by inserting
0-7 “0” bits before the start code.

Figure 13.22. MPEG-2 Program Stream Structure. Marker and reserved bits not shown.

Pack Layer

Data for each pack consists of a pack 
header followed by an optional system header 
and one or more PES packets. 

Pack_start_code 

This 32-bit string has a value of
0x000001BA and indicates the beginning of a
pack.

Marker_bits

These 2 bits have a value of “01.”

System_clock_reference_base [32-30]

Marker_bit

This bit has a value of “1.”

System_clock_reference_base [29-15]

Marker_bit

This bit has a value of “1.”

System_clock_reference_base [14-0]

Marker_bit

This bit has a value of “1.”

System_clock_reference_extension

This 9-bit field, along with the
system_clock_reference_base field, comprises
the system clock reference (SCR). The
system_clock_reference_base field is specified in
units of 90 kHz (the 27 MHz system clock
divided by 300). The
system_clock_reference_extension is specified in
units of 27 MHz. SCR indicates the intended
time of arrival of the byte containing the last bit
of the system_clock_reference_base.



Marker_bit 

This bit has a value of “1.” 

Program_mux_rate 

This 22-bit binary number specifies a value
measured in units of 50 bytes per second, with a
value of zero forbidden. It indicates the rate at which
the decoder receives the program stream. For
the SVCD standard, this value must be ≤ 6972 (decimal).

Marker_bit

This bit has a value of “1.”

Marker_bit

This bit has a value of “1.”

Reserved_bits

These five bits have a value of “1 1111.”

Pack_stuffing_length

This 3-bit binary number specifies the
number of stuffing_byte fields following this
field.

Stuffing_byte

0-7 stuffing bytes may be present. They
are ignored by the decoder. Each byte has a
value of “1111 1111.”
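
The fixed part of the pack header described above is 14 bytes long: the 4-byte start code, 6 bytes of SCR with its marker bits, 3 bytes of program_mux_rate with two marker bits, and 1 byte holding the reserved bits and pack_stuffing_length. The Python sketch below is one possible way to unpack it. The function name and the returned dictionary keys are hypothetical.

```python
PACK_START_CODE = b"\x00\x00\x01\xBA"


def parse_pack_header(data: bytes) -> dict:
    """Unpack the fixed 14 bytes of an MPEG-2 program stream pack header."""
    if len(data) < 14 or data[:4] != PACK_START_CODE:
        raise ValueError("not a pack header")
    scr = int.from_bytes(data[4:10], "big")       # 48 bits
    if (scr >> 46) != 0b01:
        raise ValueError("marker bits '01' not found")
    base = (((scr >> 43) & 0x7) << 30
            | ((scr >> 27) & 0x7FFF) << 15
            | ((scr >> 11) & 0x7FFF))
    extension = (scr >> 1) & 0x1FF
    # 22-bit program_mux_rate followed by two marker bits.
    mux_rate = int.from_bytes(data[10:13], "big") >> 2
    return {
        "scr_27mhz": base * 300 + extension,      # SCR in 27 MHz ticks
        "mux_rate_bytes_per_s": mux_rate * 50,    # units of 50 bytes/second
        "pack_stuffing_length": data[13] & 0x07,
    }


# A pack header with SCR = 0 and program_mux_rate = 25200 (10.08 Mbps).
header = bytes.fromhex("000001BA44000400040101 89C3F8".replace(" ", ""))
info = parse_pack_header(header)
print(info["mux_rate_bytes_per_s"] * 8)  # bits per second
```

By the same arithmetic, the SVCD limit of 6972 corresponds to 6972 × 50 = 348,600 bytes per second, or 2,788,800 bits per second.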

System Header 

This field contains a summary of the bit- 
stream parameters. There must be one follow- 
ing the first pack header, and then it may be 
optionally repeated in future pack headers. 

System_header_start_code 

This 32-bit string has a value of
0x000001BB and indicates the beginning of a
system header.

Header_length

This 16-bit binary number specifies the 
number of bytes of the system header follow- 
ing this field. 

Marker_bit 

This bit has a value of “1.” 

Rate_bound 

This 22-bit binary number specifies a value
greater than or equal to the maximum value of
program_mux_rate. For the SVCD standard,
this value must be 6972 (decimal).

Marker_bit 

This bit has a value of “1.” 

Audio_bound 

This 6-bit binary number specifies a value 
(0 to 32) that is greater than or equal to the 
maximum number of active audio streams. For 
the SVCD standard, this value must be 0, 1, or 2.

Fixed_flag 

A “1” for this bit indicates fixed bit-rate 
operation. A “0” indicates variable bit-rate oper- 
ation. For the SVCD standard, this value must 
be “0.” 

CSPS_flag

This bit is a “1” if the program stream is a 
“constrained system parameters stream” 
(CSPS). 

System_audio_lock_flag 

This bit is a “1” if there is a specified, con- 
stant relationship between the audio sampling 
rate and the decoder system clock frequency. 
For the SVCD standard, this value must be “1.” 



System_video_lock_flag 

This bit is a “1” if there is a specified, con- 
stant relationship between the video picture 
rate and the decoder system clock frequency. 
For the SVCD standard, this value must be “1.” 

Marker_bit 

This bit has a value of “1.” 

Video_bound 

This 5-bit binary number specifies a value 
(0 to 16) that is greater than or equal to the 
maximum number of active video streams. For 
the SVCD standard, this value must be 0 or 1. 

Reserved_bits 

These 7 bits have a value of “111 1111.”

Stream_ID

This optional 8-bit code shown in Table
13.40 indicates the stream to which the
P-STD_buffer_bound_scale and
P-STD_buffer_size_bound fields apply.



Marker_bits 

These optional 2 bits have a value of “11.”
They are present only if stream_ID is present.

P-STD_buffer_bound_scale

This optional bit is present only when
stream_ID is present and indicates the scaling
factor used for P-STD_buffer_size_bound. A “0”
indicates the stream_ID specifies an audio
stream. A “1” indicates that stream_ID specifies
a video stream. For other types of stream
IDs, the value may be either a “0” or a “1.”

P-STD_buffer_size_bound

This optional 13-bit binary number specifies
a value greater than or equal to the maximum
decoder input buffer size. It is present
only when stream_ID is present. If
P-STD_buffer_bound_scale is a “0,” the unit is
128 bytes. If P-STD_buffer_bound_scale is a
“1,” the unit is 1024 bytes.

Program Stream Map (PSM) 

The program stream map (PSM) provides 
a description of the bitstreams in the program 
stream, and their relationship to one another. It 
is present as PES packet data if stream_ID = 
program stream map. 

Packet_start_code_prefix 

This 24-bit string has a value of 0x000001 
and indicates the beginning of a stream map. 

Map_stream_ID 

This 8-bit string has a value of “1011 1100.” 

Program_stream_map_length 

This 16-bit binary number indicates the
number of bytes following this field. It has a
maximum value of 1018 (decimal).

Current_next_indicator 

A “1” for this bit indicates that the program 
stream map is currently applicable. A “0” indi- 
cates the program stream map is not applicable 
yet and will be the next one valid. 

Reserved_bits 

These 2 bits have a value of “11.” 



Program_stream_map_version 

This 5-bit binary number specifies the version
number of the program stream map. It
must be incremented by one when the program
stream map changes, wrapping around
to zero after reaching a value of 31.

Reserved_bits

These 7 bits have a value of “111 1111.”

Marker_bit

This bit always has a value of “1.”

Program_stream_info_length 

This 16-bit binary number specifies the 
total length in bytes of the descriptors immedi- 
ately following this field. 

Descriptor_loop 

Various descriptors may be present in this
descriptor_loop.

Elementary_stream_map_length 

This 16-bit binary number indicates the 
number of bytes of all elementary stream infor- 
mation in this program stream map. 



Note: The following four fields are present for each
stream that has a unique stream_type value.

Stream_type 

This 8-bit codeword specifies the type of 
stream as shown in Table 13.45. 

Elementary_stream_ID 

This 8-bit field specifies the value of 
stream_ID, as listed in Table 13.40, in the PES 
packet header of PES packets containing this 
bitstream. 







Stream Type                                   Code
reserved                                      0000 0000
MPEG-1.2 video                                0000 0001
MPEG-2.2 video                                0000 0010
MPEG-1.3 audio                                0000 0011
MPEG-2.3 audio                                0000 0100
MPEG-2 private sections                       0000 0101
MPEG-2 PES packets containing private data    0000 0110
ISO/IEC 13522 MHEG                            0000 0111
MPEG-2 DSM CC                                 0000 1000
MPEG-1, MPEG-2 auxiliary                      0000 1001
MPEG-2.6 Type A                               0000 1010
MPEG-2.6 Type B                               0000 1011
MPEG-2.6 Type C                               0000 1100
MPEG-2.6 Type D                               0000 1101
MPEG-2 auxiliary                              0000 1110
MPEG-2.7 audio                                0000 1111
MPEG-4.2 visual                               0001 0000
MPEG-4.3 audio                                0001 0001
MPEG-4 SL-packetized stream or FlexMux stream
  carried in PES packets                      0001 0010
MPEG-4 SL-packetized stream or FlexMux stream
  carried in MPEG-4 sections                  0001 0011
MPEG-2.6 Synchronized Download Protocol       0001 0100
Metadata carried in PES packets               0001 0101
Metadata carried in metadata sections         0001 0110
Metadata carried in MPEG-2.6 Data Carousel    0001 0111
Metadata carried in MPEG-2.6 Object Carousel  0001 1000
Metadata carried in MPEG-2.6 Synchronized
  Download Protocol                           0001 1001
MPEG-2.11 IPMP stream                         0001 1010
MPEG-4.10 video                               0001 1011
reserved                                      0001 1100-0111 1110
IPMP stream                                   0111 1111
user private                                  1000 0000-1111 1111

some common user private details

DigiCipher II video                           1000 0000
Dolby Digital audio                           1000 0001
SCTE standard subtitle                        1000 0010
SCTE isochronous data                         1000 0011
ATSC program identifier                       1000 0101
Dolby Digital Plus audio                      1000 0111
ATSC Data Service Table,
  Network Resources Table                     1001 0101
SCTE IP data                                  1010 0000
ATSC synchronous data stream or
  SCTE isochronous data                       1100 0010
SCTE asynchronous data                        1100 0011

Table 13.45. Common stream_type Codewords. The Code Point Registry at
www.atsc.org provides a complete listing of stream_type codes.

Elementary_stream_info_length

This 16-bit binary number specifies the
total length in bytes of the descriptors immediately
following this field.

Descriptor_loop

Various descriptors may be present in this
descriptor_loop.

CRC_32 

This 32-bit CRC is for the entire program 
stream map. 

Program Stream Directory 

The program stream directory provides a 
description of the bitstreams in the program 
stream, and their relationship to one another. It 
is present as PES packet data if stream_ID =
program stream directory.



Transport Stream 

Designed for use in environments where 
errors are likely, such as transmission over 
long distances or noisy environments, trans- 
port streams are used by the ARIB, ATSC, 
DVB, digital cable, and OpenCable stan- 
dards. 

A transport stream combines one or more 
programs, with one or more independent time 
bases, into a single stream. Each program in a 
transport stream may have its own time base. 
The time bases of different programs within a 
transport stream may be different. 

The transport stream consists of one or
more 188-byte packets. The data for each
packet is from PES packets, PSI (Program
Specific Information) sections, stuffing bytes, or
private data. In addition to MPEG-2 data,
MPEG-4.2, MPEG-4.10 (H.264), SMPTE 421M
(VC-1), and other data may also be sent using
MPEG-2 transport streams.

At the start of each packet is a Packet 
IDentifier (PID) that enables the decoder to 
determine what to do with the packet. If the 
MPEG data is sent using “multiple channels 
per carrier,” the decoder uses the PIDs to 
determine which packets are part of the cur- 
rent channel being watched or recorded and 
therefore process them, discarding the rest. 
System Information (SI), such as program 
guides, channel frequencies, etc., are also 
assigned unique PID values. 
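
The PID-based selection described above can be sketched in a few lines of Python. This is a minimal illustration rather than a complete demultiplexer: it assumes the input is already aligned on 188-byte packet boundaries, that each packet begins with the sync byte 0x47, and that the 13-bit PID occupies the low 5 bits of the second byte and all of the third byte. The function name is hypothetical.

```python
PACKET_SIZE = 188
SYNC_BYTE = 0x47


def filter_pids(stream: bytes, wanted_pids: set) -> list:
    """Return the transport packets whose PID is in wanted_pids."""
    kept = []
    for offset in range(0, len(stream) - PACKET_SIZE + 1, PACKET_SIZE):
        packet = stream[offset:offset + PACKET_SIZE]
        if packet[0] != SYNC_BYTE:
            continue  # not aligned or corrupted; a real decoder would resync
        pid = ((packet[1] & 0x1F) << 8) | packet[2]
        if pid in wanted_pids:
            kept.append(packet)  # part of the selected program
        # Packets with any other PID are discarded.
    return kept
```

A decoder would typically call such a routine with the PIDs of the video and audio streams of the channel currently being watched or recorded.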

The general format of the transport stream
is shown in Figure 13.23. Note that start codes
(0x000001xx) must be byte aligned by inserting
0-7 “0” bits before the start code.

Packet Layer

Data for each packet consists of a packet
header followed by an optional adaptation field
and/or one or more data packets.

Sync_byte 

This 8-bit string has a value of “0100 0111.” 

Transport_error_indicator 

A “1” for this bit indicates that at least one 
uncorrectable bit error is present in the 
packet. 

Payload_unit_start_indicator 

The meaning of this bit is dependent on 
the payload. 

For PES packet data, a “1” indicates the
data block in this packet starts with the first
byte of a PES packet. A “0” indicates no PES
packet starts in the data block of this packet.

For PSI data, a “1” indicates the data block
of this packet contains the first byte of a PSI
section.

Figure 13.23. MPEG-2 Transport Stream Structure. Marker and reserved bits not shown. Some
applications add a 4-byte TP_extra_header prefix (resulting in 192 bytes), consisting of arrival time
stamp and copy permission information.

Transport_priority 

A “1” for this bit indicates that this packet 
is of higher priority than other packets having 
the same PID. For the DVB standard, this field 
is optional, and may be ignored by the decoder 
if present. 

PID 

This 13-bit codeword indicates the type of 
data in the data block, as shown in Table 13.46. 
The Code Point Registry at www.atsc.org pro- 
vides a complete listing of PID codes. 

Transport_scrambling_control 

This 2-bit code indicates the scrambling 
mode of the payload. “00” = not scrambled, 
“01” = not scrambled (private use), “10” = 
scrambled with even key, “11” = scrambled 
with odd key. A value other than “00” requires 
a CA descriptor be present in the elementary 
stream. 



Description                             Code
program association table               0 0000 0000 0000
conditional access table                0 0000 0000 0001
transport stream description table      0 0000 0000 0010
IPMP control information table          0 0000 0000 0011
MPEG-2 reserved                         0 0000 0000 0100 through 0 0000 0000 1111
used by DVB                             0 0000 0001 0000 through 0 0000 0001 1111
used by ARIB                            0 0000 0010 0000 through 0 0000 0010 1111
used by ATSC, CEA, and SCTE             1 1111 1111 0111 through 1 1111 1111 1110
MPEG-2 null packet                      1 1111 1111 1111

Table 13.46. Common PID Codewords.

Adaptation_field_control

This 2-bit code indicates whether this
packet header is followed by an adaptation
field or data block, as shown in Table 13.47.

Description                         Code
reserved                            00
data only                           01
adaptation field only               10
adaptation field followed by data   11

Table 13.47. MPEG-2 adaptation_field_control
Codewords.

Continuity_counter 

This 4-bit binary number increments with
each packet with the same PID. After reaching
the maximum value, it wraps around. It does
not increment when no data block is present in
the packet.

Adaptation_field 

See the Adaptation Field section. 

Data_byte 

These [n] data bytes are contiguous bytes
of data from PES packets, PSI sections, stuffing
bytes, or private data. [n] = 184 minus the
number of data bytes in the Adaptation Field (if
present). This field is present if
adaptation_field_control = “01” or “11.”
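
The packet header fields above occupy the first four bytes of every 188-byte transport packet. The following Python sketch (a hypothetical helper, shown only to make the bit layout concrete) unpacks them in the order they are listed.

```python
def parse_ts_header(packet: bytes) -> dict:
    """Unpack the 4-byte header of a 188-byte transport stream packet."""
    if len(packet) != 188:
        raise ValueError("transport packets are 188 bytes long")
    if packet[0] != 0x47:                      # sync_byte "0100 0111"
        raise ValueError("sync byte not found")
    return {
        "transport_error_indicator": bool(packet[1] & 0x80),
        "payload_unit_start_indicator": bool(packet[1] & 0x40),
        "transport_priority": bool(packet[1] & 0x20),
        "pid": ((packet[1] & 0x1F) << 8) | packet[2],   # 13 bits
        "transport_scrambling_control": packet[3] >> 6,
        "adaptation_field_control": (packet[3] >> 4) & 0x3,
        "continuity_counter": packet[3] & 0x0F,         # 4 bits
    }


# A null packet: PID "1 1111 1111 1111", data only, continuity counter 0.
null_packet = bytes([0x47, 0x1F, 0xFF, 0x10]) + b"\xff" * 184
print(parse_ts_header(null_packet))
```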

Adaptation Field 

This field contains the 42-bit program
clock references (PCRs), random access indicator
and splice countdown, as well as other
MPEG and private information. The PCRs are
used to recreate the same 27 MHz time base
clock at the decoder that was used at the
encoder. This is the clock on which the presentation
time stamps (PTS) are based. PCRs typically
occur every 0.1 second in the stream.
This field is present if adaptation_field_control
= “10” or “11.”

Adaptation_field_length 

This 8-bit binary number specifies the
number of bytes immediately following this
field. The value “0000 0000” is for inserting a
single stuffing byte in a transport stream
packet. When adaptation_field_control = “11,”
the value is 0-182 (decimal). When
adaptation_field_control = “10,” the value is
183 (decimal).

Note: None of the following fields is present if
adaptation_field_length = “0000 0000.”

Discontinuity_indicator 

If this 1-bit flag is a “1,” it indicates that 
there is a discontinuity state for the current 
transport stream packet. 

Random_access_indicator 

This 1-bit flag indicates whether the current
transport stream packet, and possibly subsequent
transport stream packets with the same
PID, contain some information to aid random
access.

Elementary_stream_priority_indicator

This 1-bit flag indicates, among packets
with the same PID, the priority of the elementary
stream data carried within this transport
stream packet. A “1” indicates that the payload
has a higher priority than the payloads of other
transport stream packets. A “0” indicates that
the payload has the same priority as all other
packets which do not have this bit set to “1.”

PCR_flag

A value of “1” for this 1-bit flag indicates
that the adaptation field contains a PCR field. A
value of “0” indicates that the adaptation field
does not contain any PCR field.

OPCR_flag

A value of “1” for this optional 1-bit flag
indicates that the adaptation field contains an
OPCR field. A value of “0” indicates that the
adaptation field does not contain any OPCR
field.

Splicing_point_flag 

A value of “1” for this 1-bit flag indicates
that a splice_countdown field is present in the
associated adaptation field, specifying the
occurrence of a splicing point.

Transport_private_data_flag 

A value of “1” for this 1-bit flag indicates 
that the adaptation field contains one or more 
private_data bytes. 

Adaptation_field_extension_flag 

A value of “1” for this 1-bit flag indicates 
the presence of an adaptation field extension. 



Program_clock_reference_base 

The 33 LSBs of the optional 42-bit
program_clock_reference field. This field is
present if PCR_flag = “1.”

Reserved_bits

These 6 optional bits have a value of “11
1111.” This field is present if PCR_flag = “1.”

Program_clock_reference_extension

This 9-bit optional field, along with the
program_clock_reference_base field, comprises
the 42-bit program clock reference (PCR). PCR
indicates the intended time of arrival of the
byte containing the last bit of the
program_clock_reference_base field at the input
of the decoder. This field is present if PCR_flag
= “1.”
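
The PCR occupies six bytes of the adaptation field: the 33-bit base, 6 reserved bits, and the 9-bit extension. As with the other clock references, the combined value in 27 MHz ticks is base × 300 + extension. The Python sketch below is illustrative only; the function names are hypothetical, and it assumes the adaptation field flags byte uses the bit order in which the flags are listed above, so that PCR_flag is the bit with weight 0x10.

```python
def decode_pcr(field: bytes) -> int:
    """Return the PCR in 27 MHz ticks from the 6-byte PCR field."""
    if len(field) != 6:
        raise ValueError("PCR field must be exactly 6 bytes")
    v = int.from_bytes(field, "big")   # 48 bits: base(33) reserved(6) ext(9)
    base = v >> 15
    extension = v & 0x1FF
    return base * 300 + extension


def extract_pcr(packet: bytes):
    """Return the PCR of a 188-byte transport packet, or None if absent."""
    adaptation_field_control = (packet[3] >> 4) & 0x3
    if adaptation_field_control not in (0b10, 0b11):
        return None                    # no adaptation field
    adaptation_field_length = packet[4]
    if adaptation_field_length == 0:
        return None                    # no flags byte follows
    if not packet[5] & 0x10:           # PCR_flag
        return None
    return decode_pcr(packet[6:12])
```

Dividing the result by 27,000,000 gives the time in seconds.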



Original_program_clock_reference_base 

The 33 LSBs of the optional 42-bit
original_program_clock_reference field. This
field is present if OPCR_flag = “1.”

Reserved_bits

These six optional bits have a value of “11
1111.” This field is present if OPCR_flag = “1.”

Original_program_clock_reference_extension

This 9-bit optional field, along with the
original_program_clock_reference_base field,
comprises the 42-bit original program clock
reference (OPCR). OPCR is present only in
transport stream packets that have a PCR.
OPCR assists in the reconstruction of a single
program transport stream from another transport
stream. This field is present if OPCR_flag
= “1.”



Splice_countdown 

This 8-bit 2’s complement number represents
a value which may be positive or negative.
A positive value specifies the remaining
number of transport stream packets, of the
same PID, following the associated transport
stream packet until a splicing point is reached.
A negative number indicates that the associated
transport stream packet is the nth packet
following the splicing point. This field is
present if splicing_point_flag = “1.”
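
Because splice_countdown is stored as an 8-bit 2’s complement number, the raw byte must be sign-extended before it is interpreted. A minimal Python sketch, with a hypothetical function name:

```python
def decode_splice_countdown(raw: int) -> int:
    """Interpret the raw splice_countdown byte as a signed value."""
    if not 0 <= raw <= 0xFF:
        raise ValueError("splice_countdown is an 8-bit field")
    # Values with the top bit set represent negative numbers.
    return raw - 256 if raw & 0x80 else raw


# 0x05 means five more packets of this PID remain before the splicing
# point; 0xFF (-1) marks a packet that lies after the splicing point.
print(decode_splice_countdown(0x05), decode_splice_countdown(0xFF))
```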



Transport_private_data_length 

This 8-bit optional binary number [n] specifies
the number of bytes immediately following
this field. This field is present if
transport_private_data_flag = “1.”

Private_data_byte

These optional [n] data bytes are not specified
by the MPEG-2 standard. This field is
present if transport_private_data_flag = “1.”



Adaptation_field_extension_length 

This optional 8-bit binary number indicates
the number of bytes of the extended adaptation
field data immediately following this field,
including reserved bytes if present. This field
is present if adaptation_field_extension_flag =
“1.”

Ltw_flag

A “1” for this optional 1-bit flag (legal time
window flag) indicates the presence of the
ltw_offset field. This field is present if
adaptation_field_extension_flag = “1.”

Piecewise_rate_flag

A “1” for this optional 1-bit flag indicates
the presence of the piecewise_rate field. This
field is present if
adaptation_field_extension_flag = “1.”

Seamless_splice_flag

A “1” for this optional 1-bit flag indicates
that the splice_type and DTS_next_AU fields are
present. This field is present if
adaptation_field_extension_flag = “1.”

Reserved_bits

These five optional bits have a value of “1
1111.” This field is present if
adaptation_field_extension_flag = “1.”

Ltw_valid_flag

A “1” for this optional 1-bit flag indicates
that the value of ltw_offset is valid. This field is
present if adaptation_field_extension_flag = “1”
and ltw_flag = “1.”

Ltw_offset

This optional 15-bit binary number specifies
the legal time window offset in units of
(300/fs) seconds, where fs is the system clock
frequency of the program that this PID
belongs to. This field is present if
adaptation_field_extension_flag = “1” and
ltw_flag = “1.”



Reserved_bits 

These two optional bits have a value of 
“11.” This field is present if 
adaptation Jieldjextension Jlag = “1” and 

piecewise _rate Jlag = “l.” 

Piecewise_rate 

This optional 22-bit binary number speci-
fies a hypothetical bit-rate used to define the
end times of the Legal Time Windows of trans-
port stream packets of the same PID that fol-
low this packet but do not include the
legal_time_window_offset field. This field is
present if adaptation_field_extension_flag = “1”
and piecewise_rate_flag = “1.”
Splice_type

This optional 4-bit binary number has the
same value in all the subsequent transport
stream packets of the same PID in which it is
present, until the packet in which the
splice_countdown reaches zero (including this
packet). If the elementary stream carried in
that PID is an audio stream, this field shall
have the value “0000.” If the elementary
stream carried in that PID is a video stream,
this field indicates the conditions that shall be
respected by this elementary stream for splic-
ing purposes. This field is present if
adaptation_field_extension_flag = “1” and
seamless_splice_flag = “1.”

DTS_next_AU[32...30]

This optional 33-bit field indicates the
decoding time of the first access unit following
the splicing point. This field is present if
adaptation_field_extension_flag = “1” and
seamless_splice_flag = “1.”

Marker_bit

This optional bit always has a value of “1.”
This field is present if
adaptation_field_extension_flag = “1” and
seamless_splice_flag = “1.”

DTS_next_AU[29...15]

This optional field is present if
adaptation_field_extension_flag = “1” and
seamless_splice_flag = “1.”

Marker_bit

This optional bit always has a value of “1.”
This field is present if
adaptation_field_extension_flag = “1” and
seamless_splice_flag = “1.”

DTS_next_AU[14...0]

This optional field is present if
adaptation_field_extension_flag = “1” and
seamless_splice_flag = “1.”

Marker_bit

This optional bit always has a value of “1.”
This field is present if
adaptation_field_extension_flag = “1” and
seamless_splice_flag = “1.”



Reserved_bits 

These optional [n] data bytes have a value
of “1111 1111.” They are present if ltw_flag,
piecewise_rate_flag, and seamless_splice_flag =
“0,” and adaptation_field_extension_flag = “1.”
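
The adaptation field extension described above can be unpacked with a few shifts and masks. The following Python sketch is illustrative only: the class and function names are hypothetical, it assumes the buffer starts at adaptation_field_extension_length, and it assumes the fields are packed most significant bit first in the order listed above. The 27 MHz default for the system clock frequency is the nominal MPEG-2 value and is likewise an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AdaptationFieldExtension:
    ltw_valid_flag: Optional[bool] = None
    ltw_offset: Optional[int] = None
    piecewise_rate: Optional[int] = None
    splice_type: Optional[int] = None
    dts_next_au: Optional[int] = None


def parse_adaptation_field_extension(buf: bytes) -> AdaptationFieldExtension:
    """Parse from adaptation_field_extension_length onward (hypothetical helper)."""
    if not buf or buf[0] < 1 or len(buf) < 1 + buf[0]:
        raise ValueError("truncated adaptation field extension")
    body = buf[1:1 + buf[0]]
    flags = body[0]
    ext = AdaptationFieldExtension()
    pos = 1
    if flags & 0x80:  # ltw_flag: 1-bit ltw_valid_flag + 15-bit ltw_offset
        word = int.from_bytes(body[pos:pos + 2], "big")
        ext.ltw_valid_flag = bool(word >> 15)
        ext.ltw_offset = word & 0x7FFF
        pos += 2
    if flags & 0x40:  # piecewise_rate_flag: 2 reserved bits + 22-bit rate
        word = int.from_bytes(body[pos:pos + 3], "big")
        ext.piecewise_rate = word & 0x3FFFFF
        pos += 3
    if flags & 0x20:  # seamless_splice_flag: 40 bits including marker bits
        word = int.from_bytes(body[pos:pos + 5], "big")
        ext.splice_type = (word >> 36) & 0xF
        ext.dts_next_au = (((word >> 33) & 0x7) << 30      # DTS_next_AU[32...30]
                           | ((word >> 17) & 0x7FFF) << 15  # DTS_next_AU[29...15]
                           | ((word >> 1) & 0x7FFF))        # DTS_next_AU[14...0]
        pos += 5
    if pos > len(body):
        raise ValueError("extension shorter than its flags imply")
    return ext


def ltw_offset_seconds(ltw_offset: int, system_clock_hz: int = 27_000_000) -> float:
    """Convert ltw_offset from units of (300/f_s) seconds to seconds."""
    return ltw_offset * 300 / system_clock_hz
```

Any bytes left after the optional fields are the reserved bytes counted by adaptation_field_extension_length; the sketch simply ignores them.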



Stuffing_byte 

These optional [n] data bytes have a value
of “1111 1111.” They are present if OPCR_flag,
adaptation_field_extension_flag, PCR_flag,
transport_private_data_flag, and
splicing_point_flag = “0.”

Program Specific Information (PSI) 

Program Specific Information (PSI) is 
additional data that enables decoders to find 
desired content more efficiently in a single 
transport stream and assemble a user-friendly 
electronic program guide (EPG).

Programs are composed of one or more 
packetized audio, video, data, etc. elementary 
streams (PES), each assigned a 13-bit Packet 
IDentification number (PID). In addition, 
transport stream packets carrying the same 
PES are assigned the same, but unique, PID. 
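
Selecting the packets of one elementary stream therefore comes down to reading the 13-bit PID from each packet header. The Python sketch below is a minimal illustration, assuming the usual transport stream packet layout (188-byte packets, a sync byte of 0x47, and the PID in the low 5 bits of the second byte followed by the whole third byte). The function names are hypothetical.

```python
from typing import Iterator

TS_PACKET_SIZE = 188   # assumed fixed transport stream packet size
SYNC_BYTE = 0x47       # assumed sync byte at the start of every packet


def packet_pid(packet: bytes) -> int:
    """Return the 13-bit PID of one transport stream packet."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a transport stream packet")
    return ((packet[1] & 0x1F) << 8) | packet[2]


def packets_with_pid(stream: bytes, wanted_pid: int) -> Iterator[bytes]:
    """Yield the packets in an aligned stream that carry the wanted PID."""
    for start in range(0, len(stream) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        packet = stream[start:start + TS_PACKET_SIZE]
        if packet_pid(packet) == wanted_pid:
            yield packet
```

A demultiplexer would call packets_with_pid once per PID it has learned from the PSI tables, or more realistically dispatch each packet on its PID in a single pass.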

The MPEG-2 decoder can process the cor-
rect packets only if it knows what the correct
PIDs are. This is the function of the PSI. PSI is
carried in transport stream packets having
unique PIDs so the MPEG-2 decoder can eas-
ily find it.

Stream Type                         Abbreviation  PID                  Description

Program Association Table           PAT           0x0000               Associates program number and Program Map Table PID
Conditional Access Table            CAT           0x0001               Associates one or more (private) EMM streams each with a unique PID value
Transport Stream Description Table  TSDT          0x0002               Associates one or more descriptors to an entire transport stream
IPMP Control Information Table      ICIT          0x0003               Contains IPMP tool list, rights container, tool container defined in MPEG-2.11
Program Map Table                   PMT           Assigned by PAT      Specifies PID values for components of one or more programs
Network Information Table           NIT           Assigned by PAT (1)  Physical network parameters such as FDM frequencies, transponder numbers, etc.

Table 13.48. Program Specific Information Tables. (1) PID = 0x0010 for ARIB and many DVB
systems.

[Figure: the PAT (in TS packets with PID = 0) and the CAT (in TS packets with PID = 1);
the PMT for program 10 (in TS packets with PID = 1017) and the PMT for program 117
(in TS packets with PID = 889).]

Figure 13.24. MPEG-2 PAT and PMT Example.

PSI has six tables, as shown in Table 13.48. 
The program association table (PAT), condi- 
tional access table (CAT), transport stream 
description table (TSDT), IPMP control infor- 
mation table (ICIT) and null packet are the 
only fixed PIDs. The MPEG-2 decoder deter- 
mines the remaining PIDs by accessing the 
appropriate table. 

Upon first receiving a transport stream, 
the MPEG-2 decoder looks for the PAT, CAT, 
TSDT, and ICIT. As shown in Figure 13.24, 
from the PAT, the PIDs of the network infor- 
mation table (NIT) and each PMT are read. 
From the PMTs, the PIDs of each elementary 
stream are read. If a program is encrypted, 
access to the CAT is also required. 

Program Association Table (PAT) 

Each transport stream contains one or 
more transport stream packets with a PID 
value of 0x0000. Together, all these packets 
make up the complete program association 
table (PAT). 

The PAT provides a complete list of all pro- 
grams within the transport stream, as illus- 
trated in Figure 13.24. Included for each 
program is the PID value of transport stream 
packets containing its corresponding program 
map table (PMT).

As demultiplexing is impossible without a 
PAT, the locking speed to a new program 
depends on how often the PATs are sent. 
MPEG-2 specifies a maximum of 0.5 seconds 
between PATs and any PMTs that are referred 
to in the PATs. 

The PAT may be segmented into one or
more sections before insertion into transport
stream packets, using the following syntax:



Table_ID

This 8-bit codeword identifies the type of 
content and has a value of 0x00 as shown in 
Table 13.49. The Code Point Registry at 
www.atsc.org provides a complete listing of 
table_ID codes. 



Description                                                  Code

PAT section                                                  0x00
CAT section                                                  0x01
PMT section                                                  0x02
TSDT section                                                 0x03
MPEG-4_scene_description_section                             0x04
MPEG-4_object_descriptor_section                             0x05
Metadata section                                             0x06
ICIT section                                                 0x07
MPEG-2 reserved                                              0x08-0x38
MPEG-2.6 DSM-CC addressable sections                         0x39
MPEG-2.6 DSM-CC sections containing multi-protocol data      0x3A
MPEG-2.6 DSM-CC sections containing U-N messages             0x3B
MPEG-2.6 DSM-CC sections containing download data messages   0x3C
MPEG-2.6 DSM-CC sections containing stream descriptors       0x3D
MPEG-2.6 DSM-CC sections containing private data             0x3E
MPEG-2.6 DSM-CC addressable sections                         0x3F
used by DVB                                                  0x40-0x7F
used by ARIB, ATSC CA, and DVB CA                            0x80-0x8F
used by ATSC and SCTE                                        0xC0-0xFE
forbidden                                                    0xFF

Table 13.49. Common Table_ID Codewords.



Section_syntax_indicator 

This 1-bit flag is always “1.”

Reserved_bits

These 3 bits have a value of “011.”

Section_length

This 12-bit binary number specifies the
number of bytes of this section, starting imme-
diately following this field and including the
CRC. The value in this field may not exceed
1021 (0x3FD).

Transport_stream_ID 

This 16-bit binary number serves as a label 
to identify this transport stream from any other 
multiplex within a network. 

Reserved_bits 

These 2 bits have a value of “11.” 

Version_number 

This 5-bit binary number is the version
number of the whole PAT. The version number
is incremented by 1 (modulo 32) whenever the
definition of the PAT changes.

When current_next_indicator is “1,”
version_number shall be that of the currently
applicable PAT. When current_next_indicator is
“0,” version_number is that of the next
applicable PAT.

Current_next_indicator 

When this 1-bit flag is “1,” the PAT sent is
currently applicable. When it is “0,” the PAT 
sent is not yet applicable and shall be the next 
PAT to become valid. 

Section_number 

This 8-bit binary number specifies the
number of this section. The section_number of
the first section in the PAT must be 0x00. It is
incremented by 1 with each additional section
in the PAT.



Last_section_number 

This 8-bit binary number specifies the
number of the last section (that is, the section
with the highest section_number) of the com-
plete PAT.



Note: The following four fields are repeated for each pro- 
gram. 

Program_number 

This 16-bit binary number specifies the
program to which the program_map_PID is
assigned.

Reserved_bits 

These three bits have a value of “111.” 

Network_PID 

This 13-bit binary number specifies the
PID of transport stream packets that contain
the network information table (NIT). It may
have a value of 0x0010-0x1FFE. This field is
present only if program_number = 0x0000.

Program_map_PID 

This 13-bit binary number specifies the
PID of transport stream packets that contain
the PMT section applicable for the program as
specified by program_number. It may have a
value of 0x0010-0x1FFE. This field is present
only if program_number ≠ 0x0000.

CRC_32 

32-bit CRC value. 
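
The PAT syntax above can be read back with a short parser. The following Python sketch is an illustration under stated assumptions, not a reference implementation: the function names are hypothetical, the section is assumed to be already reassembled from its transport stream packets, and the CRC parameters (polynomial 0x04C11DB7, all-ones initial value, most significant bit first, no final inversion) are the ones commonly used for MPEG-2 PSI sections rather than something stated in this chapter.

```python
def crc32_mpeg2(data: bytes) -> int:
    """Bitwise CRC-32 as assumed for MPEG-2 PSI sections (MSB first)."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte << 24
        for _ in range(8):
            if crc & 0x80000000:
                crc = ((crc << 1) ^ 0x04C11DB7) & 0xFFFFFFFF
            else:
                crc = (crc << 1) & 0xFFFFFFFF
    return crc


def parse_pat_section(section: bytes) -> dict:
    """Parse one complete PAT section, starting at table_ID."""
    if len(section) < 12 or section[0] != 0x00:
        raise ValueError("not a PAT section")
    section_length = ((section[1] & 0x0F) << 8) | section[2]
    end = 3 + section_length
    if section_length > 0x3FD or len(section) < end:
        raise ValueError("bad section_length")
    # Running the CRC over the whole section, CRC_32 included, leaves zero.
    if crc32_mpeg2(section[:end]) != 0:
        raise ValueError("CRC_32 mismatch")
    pat = {
        "transport_stream_id": int.from_bytes(section[3:5], "big"),
        "version_number": (section[5] >> 1) & 0x1F,
        "current_next_indicator": section[5] & 0x01,
        "section_number": section[6],
        "last_section_number": section[7],
        "network_pid": None,
        "programs": {},  # program_number -> program_map_PID
    }
    for pos in range(8, end - 4, 4):
        program_number = int.from_bytes(section[pos:pos + 2], "big")
        pid = int.from_bytes(section[pos + 2:pos + 4], "big") & 0x1FFF
        if program_number == 0x0000:
            pat["network_pid"] = pid
        else:
            pat["programs"][program_number] = pid
    return pat
```

For the example of Figure 13.24, the returned programs dictionary would map program 10 to PID 1017 and program 117 to PID 889.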
Program Map Table (PMT)

As illustrated in Figure 13.24, the program 
map table (PMT) provides the mappings 
between the program number and the pro- 
gram elements (video, audio, etc.). This is 
done by indicating the PID values of the audio, 
video, and other streams that belong to a given 
program. 

Note that for the ATSC, DVB, and OpenCa- 
ble standards, some PMTs require specific 
PIDs. Therefore, MPEG-2 and DVB/ATSC/ 
OpenCable™ bitstreams are not fully inter- 
changeable. 

The PMT has only one section identified
by the program_number field, using the follow-
ing syntax:

Table_ID

This 8-bit codeword identifies the type of 
content and has a value of 0x02 as shown in 
Table 13.49. The Code Point Registry at 
www.atsc.org provides a complete listing of 
table_ID codes. 

Section_syntax_indicator 

This 1-bit flag is always “1.” 

Reserved_bits 

These 3 bits have a value of “011.”

Section_length

This 12-bit binary number specifies the
number of bytes of this section, starting imme-
diately following this field and including the
CRC. The value in this field may not exceed
1021 (0x3FD).



Program_number 

This 16-bit binary number specifies the
program to which the program_map_PID is
applicable. One program definition is carried
within one program map section. A program
definition is never longer than 1016 (0x3F8).

Reserved_bits 

These two bits have a value of “11.” 

Version_number 

This 5-bit binary number is the version 
number of the program map section. The ver- 
sion number is incremented by 1 (modulo 32) 
when a change in the information carried 
within the section occurs. 

When current_next_indicator is “1,”
version_number is that of the currently
applicable program map section. When
current_next_indicator is “0,” version_number
is that of the next applicable program map
section.

Current_next_indicator 

When this 1-bit flag is “1,” the program 
map section sent is currently applicable. When 
it is “0,” the program map section sent is not 
yet applicable and is the next program map 
section to become valid. 

Section_number 

This 8-bit binary number has a value of 
0x00. 

Last_section_number 

This 8-bit binary number has a value of 
0x00. 

Reserved_bits 

These 3 bits have a value of “111.” 
PCR_PID

This 13-bit binary number indicates the
PID of the transport stream packets which
contain PCR fields valid for the program speci-
fied by program_number. This field
may have a value of 0x0010-0x1FFE. If no PCR
is associated with a program definition for pri-
vate streams, the value is 0x1FFF.

Reserved_bits 

These four bits have a value of “1111.” 

Program_info_length 

This 12-bit binary number specifies the 
total length in bytes of the descriptors immedi- 
ately following this field. 

Descriptor_loop 

[n] descriptors may be present in this
descriptor_loop.

Note: The following six fields are repeated for each 
stream type present. 

Stream_type 

This 8-bit codeword specifies the type of
program element carried within the packets
with the PID specified by elementary_PID. The
values of stream_type are specified in Table
13.45.

Reserved_bits 

These 3 bits have a value of “111.” 

Elementary_PID 

This 13-bit binary number specifies the 
PID of the transport stream packets that carry 
the associated program element. 



Reserved_bits 

These 4 bits have a value of “1111.” 

ES_info_length 

This 12-bit binary number specifies the 
total length in bytes of the descriptors immedi- 
ately following this field. 

Descriptor_loop

[n] descriptors may be present in this
descriptor_loop.



CRC_32 

32-bit CRC value. 
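
The PMT fields above can be walked in the same way as the PAT. The Python sketch below is illustrative only: the function name is hypothetical, the section is assumed to be complete and to start at table_ID, descriptors are returned as raw bytes, and CRC_32 is skipped rather than verified to keep the example short.

```python
def parse_pmt_section(section: bytes) -> dict:
    """Parse one program map section, starting at table_ID (CRC not checked)."""
    if len(section) < 16 or section[0] != 0x02:
        raise ValueError("not a PMT section")
    section_length = ((section[1] & 0x0F) << 8) | section[2]
    if section_length > 0x3FD or len(section) < 3 + section_length:
        raise ValueError("bad section_length")
    end = 3 + section_length - 4  # stop before CRC_32
    program_info_length = ((section[10] & 0x0F) << 8) | section[11]
    pos = 12 + program_info_length
    if pos > end:
        raise ValueError("program_info_length overruns the section")
    pmt = {
        "program_number": int.from_bytes(section[3:5], "big"),
        "version_number": (section[5] >> 1) & 0x1F,
        "current_next_indicator": section[5] & 0x01,
        "pcr_pid": int.from_bytes(section[8:10], "big") & 0x1FFF,
        "program_descriptors": section[12:pos],
        "streams": [],
    }
    # One entry per elementary stream: type, PID, and its descriptor loop.
    while pos + 5 <= end:
        stream_type = section[pos]
        elementary_pid = int.from_bytes(section[pos + 1:pos + 3], "big") & 0x1FFF
        es_info_length = int.from_bytes(section[pos + 3:pos + 5], "big") & 0x0FFF
        pos += 5
        if pos + es_info_length > end:
            raise ValueError("ES_info_length overruns the section")
        pmt["streams"].append({
            "stream_type": stream_type,
            "elementary_pid": elementary_pid,
            "descriptors": section[pos:pos + es_info_length],
        })
        pos += es_info_length
    return pmt
```

A decoder would typically pick the entries whose stream_type it can decode and then filter transport stream packets on the corresponding elementary_PID values.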

Transport Stream Description Table 
(TSDT) 

The optional transport stream description 
table (TSDT) can be used to include descrip- 
tors that apply to an entire transport stream. 

The TSDT may be segmented into one or 
more sections before insertion into transport 
stream packets, using the following syntax: 

Table_ID

This 8-bit codeword identifies the type of 
content and has a value of 0x03 as shown in 
Table 13.49. The Code Point Registry at 
www.atsc.org provides a complete listing of 
table_ID codes. 

Section_syntax_indicator 

This 1-bit flag is always “1.” 

Reserved_bit 

This bit has a value of “0.” 

Reserved_bits 

These 2 bits have a value of “11.” 
Section_length

This 12-bit binary number specifies the
number of bytes of this section, starting imme-
diately following this field and including the
CRC. The value in this field may not exceed
1021 (0x3FD).

Reserved_bits 

These 18 bits have a value of “11 1111 1111
1111 1111.”

Version_number 

This 5-bit binary number is the version
number of the TSDT. The version number is
incremented by 1 (modulo 32) when a change
in the information carried within the table
occurs.

When current_next_indicator is “1,”
version_number is that of the currently
applicable TSDT. When current_next_indicator
is “0,” version_number is that of the next
applicable TSDT.

Current_next_indicator 

When this 1-bit flag is “1,” the TSDT
sent is currently applicable. When it is “0,”
the TSDT sent is not yet applicable and is
the next TSDT to become valid.

Section_number 

This 8-bit binary number specifies the
number of this section. The section_number of
the first section must be 0x00. It is incre-
mented by 1 with each additional section.

Last_section_number 

This 8-bit binary number specifies the
number of the last section (that is, the section
with the highest section_number).



Descriptor_loop

[n] descriptors may be present in this
descriptor_loop.

CRC_32 

32-bit CRC value. 

Conditional Access Table (CAT) 

The conditional access table (CAT) pro- 
vides association between one or more condi- 
tional access (CA) systems, their entitlement 
management messages (EMM), and any spe- 
cial parameters associated with them. The 
PIDs for entitlement control messages (ECM) 
and entitlement management messages 
(EMM) are included in the CAT. 

The CAT may be segmented into one or 
more sections before insertion into transport 
stream packets, using the following syntax: 

Table_ID

This 8-bit codeword identifies the type of 
content and has a value of 0x01 as shown in 
Table 13.49. The Code Point Registry at 
www.atsc.org provides a complete listing of 
table_ID codes. 

Section_syntax_indicator 

This 1-bit flag is always “1.” 

Reserved_bits 

These 3 bits have a value of “011.”

Section_length

This 12-bit binary number specifies the
number of bytes of this section, starting imme-
diately following this field and including the
CRC. The value in this field may not exceed
1021 (0x3FD).
Reserved_bits

These 18 bits have a value of “11 1111 1111
1111 1111.”

Version_number 

This 5-bit binary number is the version
number of the whole CAT. The version num-
ber is incremented by 1 (modulo 32) whenever
the information carried within the CAT
changes.

When current_next_indicator is “1,”
version_number is that of the currently
applicable CAT. When current_next_indicator
is “0,” version_number is that of the next
applicable CAT.

Current_next_indicator 

When this 1-bit flag is “1,” the CAT sent is
currently applicable. When it is “0,” the CAT 
sent is not yet applicable and is the next CAT 
to become valid. 

Section_number 

This 8-bit binary number indicates the
number of this section. The section_number of
the first section in the CAT must be 0x00. It is
incremented by 1 with each additional section
in the CAT.

Last_section_number 

This 8-bit binary number specifies the
number of the last section (that is, the section
with the highest section_number) of the com-
plete CAT.

Descriptor_loop

[n] descriptors may be present in this
descriptor_loop.



CRC_32 

32-bit CRC value. 

Network Information Table (NIT) 

The first entry in the PAT (program 0) is 
reserved for network data and contains the 
PID of the network information table (NIT). 
The NIT includes information about other
transport streams that may be available, for
example, by tuning to a different RF channel or
satellite. Each transport stream may include a
descriptor that specifies the radio frequency,
satellite orbital position, etc. In MPEG-2, only
the NIT is mandatory for this purpose. In DVB,
additional metadata, known as DVB-SI, is used,
and the NIT is considered to be part of DVB-SI.
Therefore, the term PSI/SI is commonly
used.

IPMP Control Information Table (ICIT) 

The IPMP (Intellectual Property Manage- 
ment and Protection) control information table 
contains IPMP (also known as DRM or Digital 
Rights Management) related information 
including tool list, rights container, and tool 
container. 

The tool list identifies, and enables selec- 
tion of, IPMP tools required to process con- 
tent. The tool container enables the carriage of 
binary tools in content streams. The rights con- 
tainer may contain a rights description that 
describes usage rules associated with the 
IPMP protected content. 

The IPMP stream carries all types of IPMP 
data (including key, ECM, EMM) to be deliv- 
ered to the tools. 
Intellectual Property
Management and Protection 
(IPMP) 

IPMP, also called digital rights manage- 
ment (DRM), provides an interface and tools, 
rather than a complete system, for implement- 
ing intellectual property rights management. 

The level and type of management and pro- 
tection provided are dependent on the value of 
the content and the business model. For this 
reason, the complete design of the IPMP sys- 
tem is left to application developers. 

The architecture enables both open and
proprietary solutions to be used, while
enabling interoperability, supporting the use of
more than one type of protection (i.e., decryp-
tion, watermarking, rights management, and
so on), and supporting the transferring of con-
tent between devices using a defined inter-
device message (reflecting the issue of content
distribution over home networks).

For protected content, the IPMP tool 
requirements are communicated to the 
decoder before the presentation starts. Tool 
configuration and initialization information are 
conveyed by the IPMP Descriptor or IPMP ele- 
mentary stream. Needed tools can be embed- 
ded, downloaded, or acquired by other means. 

Control point and ordering sequence infor- 
mation in the IPMP Descriptor allows different 
tools to function at different places in the sys- 
tem. IPMP data, carried in either an IPMP 
Descriptor or IPMP elementary stream, 
includes rights containers, key containers, and 
tool initialization data. 



MPEG-4.2 Video over MPEG-2
Transport Streams

Instead of MPEG-2 video, the MPEG-2 
transport or program stream can carry MPEG- 
4.2 video. This enables existing infrastructures 
and equipment to accommodate the MPEG-4.2 
video codec easily. 

The PES packet stream_id = “1110 xxxx”
for MPEG-4.2 video. Stream_type = 0x10 within
the PMT or PSM. Carriage of the MPEG-4.2
stream must also be signaled by using the
MPEG-4 Video Descriptor.

MPEG-4.3 audio, MPEG-4 SL-packetized 
streams, and MPEG-4 FlexMux streams can 
also be carried by an MPEG-2 transport or pro- 
gram stream. 

MPEG-4.10 (H.264) Video 
over MPEG-2 Transport 
Streams 

Instead of MPEG-2 video, MPEG-2 trans- 
port and program streams can carry MPEG- 
4.10 (H.264) video in PES packets. This 
enables existing infrastructures and equip- 
ment to accommodate the H.264 video codec 
easily. 

The PES packet stream_id = “1110 xxxx”
for MPEG-4.10 (H.264) video. Stream_type =
0x1B within the PMT or PSM. Carriage of the
H.264 stream must also be signaled by
using the MPEG-2 AVC Video Descriptor and/
or the MPEG-2 AVC Timing and HRD Descrip-
tor.
SMPTE 421M (VC-1) Video
over MPEG-2 Transport 
Streams 

Instead of MPEG-2 video, MPEG-2 trans- 
port and program streams can carry SMPTE 
421M (VC-1) video in PES packets. This 
enables existing infrastructures and equip- 
ment to easily accommodate the SMPTE 421M 
(VC-1) video codec. 

The PES packet stream_ID = 0xFD and
stream_ID_extension = “101 0101” for SMPTE
421M video. Stream_type = 0xEA within the
PMT or PSM. Carriage of the SMPTE 421M
stream must also be signaled by using one or
more MPEG-2 Registration Descriptors; each
descriptor may contain one optional SMPTE
421M subdescriptor in the
additional_identification_info field.

A SMPTE 421M Profile and Level Subde- 
scriptor may be used to indicate the profile and 
level of the SMPTE 421M stream. A SMPTE 
421M Alignment Subdescriptor may be used to 
define which type of alignment exists between 
the coded byte sequence and a PES packet. A 
SMPTE 421M Buffer Size Subdescriptor may 
be used to specify the minimum elementary 
stream buffer size needed in the decoder to 
decode the SMPTE 421M stream.

For Simple or Main profile streams,
VC-1_SPMP_PESpacket_PayloadFormatHeader()
must be present at the beginning of every
access unit.
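
The signaling values quoted in the three sections above can be collected into a small lookup. The Python sketch below is only a convenience illustration: the table and function names are hypothetical, and it covers just the three stream_type values and PES stream_id patterns mentioned in this text.

```python
# stream_type values quoted above for video carried in MPEG-2 streams.
VIDEO_CODEC_BY_STREAM_TYPE = {
    0x10: "MPEG-4.2 video",
    0x1B: "MPEG-4.10 (H.264) video",
    0xEA: "SMPTE 421M (VC-1) video",
}

VC1_STREAM_ID = 0xFD             # stream_ID used for SMPTE 421M
VC1_STREAM_ID_EXTENSION = 0x55   # "101 0101"


def video_codec_for(stream_type: int) -> str:
    """Name the codec for a PMT/PSM stream_type, if it is one listed above."""
    return VIDEO_CODEC_BY_STREAM_TYPE.get(stream_type, "not listed here")


def is_mpeg_video_stream_id(stream_id: int) -> bool:
    """True for the "1110 xxxx" pattern used by MPEG-4.2 and H.264 video."""
    return (stream_id & 0xF0) == 0xE0


def is_vc1_pes(stream_id: int, stream_id_extension: int) -> bool:
    """True when the PES header carries the SMPTE 421M identifiers."""
    return (stream_id == VC1_STREAM_ID
            and (stream_id_extension & 0x7F) == VC1_STREAM_ID_EXTENSION)
```

In practice the stream_type alone is not sufficient, since the text also requires the matching descriptor (or registration descriptor) to be present.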



MPEG-2 PMT/PSM 
Descriptors 

These MPEG-2 descriptors are used to 
identify commonly used private (non-MPEG-2) 
information that is present in the MPEG-2 
transport or program stream. 

A descriptor is typically contained within a 
descriptor_loop in the MPEG-2 PMT or PSM.
The general format of a descriptor is: 

descriptor_tag (8 bits) 
descriptor_length (8 bits) 
data 

Descriptor tag values of 0, 1, and 44-63 are 
reserved. Values of 19-26 are reserved for 
MPEG-2.6 data. Unless otherwise indicated, 
descriptors may be present in both transport 
and program streams. 
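
Because every descriptor starts with the same tag and length bytes, a descriptor loop can be split without knowing what each descriptor means. A minimal Python sketch follows; the function name is hypothetical, and the input is assumed to be exactly the bytes counted by program_info_length or ES_info_length.

```python
from typing import Iterator, Tuple


def iter_descriptors(loop: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (descriptor_tag, data) pairs from a descriptor loop."""
    pos = 0
    while pos < len(loop):
        if pos + 2 > len(loop):
            raise ValueError("descriptor header truncated")
        tag = loop[pos]
        length = loop[pos + 1]
        data = loop[pos + 2:pos + 2 + length]
        if len(data) != length:
            raise ValueError("descriptor data truncated")
        yield tag, data
        pos += 2 + length
```

Unknown tags can simply be skipped by the caller, which is what lets older decoders ignore descriptors they do not understand.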

MPEG-2 AAC Audio Descriptor

For individual MPEG-2.7 audio streams 
carried in PES packets, this MPEG-2 descrip- 
tor provides basic information for identifying 
the coding parameters of such audio elemen- 
tary streams. 

Descriptor_tag 

This 8-bit field has a value of “0010 1011.”

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. It has a 
value of “0000 0011.” 

MPEG-2_AAC_profile 

This 8-bit field identifies AAC profile per 
the MPEG-2.7 specification. 
MPEG-2_AAC_channel_configuration

This 8-bit field identifies the number and 
configuration of the audio channels. 

MPEG-2_AAC_additional_information 

This 8-bit field identifies whether or not 
bandwidth extension data is embedded in the 
audio stream. 

Audio Stream Descriptor 

This MPEG-2 descriptor provides basic 
information which identifies the coding ver- 
sion of an audio elementary stream. 

Descriptor_tag 

This 8-bit field has a value of “0000 0011.” 

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. It has a 
value of “0000 0001.” 

Free_format_flag

A “1” for this bit indicates the bitrate_index
field in the audio stream is “0000.”

ID 

This bit is set to the same value as the ID 
field in the audio stream. 

Layer 

This 2-bit binary number is set to the same 
or higher value as the highest layer in any 
audio stream. 

Variable_rate_audio_indicator 

A “0” for this bit indicates that the bit-rate 
of the audio stream does not vary between 
audio frames. 



Reserved_bits 

These 3 bits are always “111.” 
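
The single payload byte of the audio stream descriptor packs all of the fields above. The Python sketch below shows one way to unpack it, assuming the fields are stored most significant bit first in the order listed. The function name and dictionary keys are hypothetical.

```python
def parse_audio_stream_descriptor(descriptor: bytes) -> dict:
    """Decode an audio stream descriptor (tag 0x03, length 1)."""
    if len(descriptor) < 3 or descriptor[0] != 0x03 or descriptor[1] != 1:
        raise ValueError("not an audio stream descriptor")
    flags = descriptor[2]
    return {
        "free_format_flag": bool(flags & 0x80),
        "id": (flags >> 6) & 0x01,
        "layer": (flags >> 4) & 0x03,
        "variable_rate_audio_indicator": bool(flags & 0x08),
        # The low 3 bits are reserved ("111") and are ignored here.
    }
```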

AVC Timing and HRD Descriptor

For individual MPEG-4.10 (H.264) video 
streams carried in PES packets, this MPEG-2 
descriptor describes the video stream time 
information and hypothetical reference 
decoder (HRD) information. When the H.264 
video stream does not convey the H.264 video 
usability information (VUI) parameter, this 
descriptor must be present in the PMT. 

Descriptor_tag 

This 8-bit field has a value of “0010 1010.” 

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. 

HRD_management_valid_flag 

A “1” for this bit indicates the buffering 
period SEI (see H.264) needs to be contained 
in the H.264 video stream. A “0” indicates the 
MPEG-2 leak method should be used. 

Reserved_bits 

These 6 bits are always “11 1111.” 

Picture_and_timing_info_present 

A “1” for this bit indicates this descriptor
contains 90kHz_flag and parameters for map-
ping to the system clock.



90kHz_flag

A “1” for this bit indicates the H.264 time
base is 90 kHz. A “0” indicates the N and K
fields are present. This field is present only if
picture_and_timing_info_present = “1.”
Reserved_bits

These 7 bits are always “111 1111.” This
field is present only if
picture_and_timing_info_present = “1.”



N, K 

These two 32-bit fields describe the rela-
tionship between H.264’s time_scale and
system_clock_reference. These fields are
present only if picture_and_timing_info_present
= “1” and 90kHz_flag = “0.”



Num_units_in_tick

See H.264 for the definition of this 32-bit
field. This field is present only if
picture_and_timing_info_present = “1.”



Fixed_frame_rate_flag

A “1” for this bit indicates the H.264 coded 
video frame rate is constant. A “0” indicates 
there is no information regarding the frame 
rate within the descriptor. 

Temporal_poc_flag 

A “1” for both this bit and
fixed_frame_rate_flag indicates the H.264 video
stream must convey the picture order count
(POC) information. A “0” indicates there is no
information regarding the relationship
between the POC information of the H.264
video stream and time.

Picture_to_display_conversion_flag 

A “1” for this bit indicates the H.264 video
stream contains information on displaying
coded pictures. When this bit is “0,” the
pic_struct_present_flag of the H.264 video
stream must be set to “0.”



Reserved_bits 

These 5 bits are always “1 1111.” 
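
The conditional fields of this descriptor make its length variable, so a parser has to follow the flags in order. The Python sketch below illustrates that flow. The function name and dictionary keys are hypothetical, and the fields are assumed to be packed most significant bit first in the order listed above.

```python
def parse_avc_timing_hrd_descriptor(descriptor: bytes) -> dict:
    """Decode an AVC timing and HRD descriptor (tag 0x2A)."""
    if len(descriptor) < 2 or descriptor[0] != 0x2A:
        raise ValueError("not an AVC timing and HRD descriptor")
    payload = descriptor[2:2 + descriptor[1]]
    if len(payload) != descriptor[1] or len(payload) < 2:
        raise ValueError("descriptor truncated")
    info = {
        "hrd_management_valid_flag": bool(payload[0] & 0x80),
        "picture_and_timing_info_present": bool(payload[0] & 0x01),
    }
    pos = 1
    if info["picture_and_timing_info_present"]:
        info["90kHz_flag"] = bool(payload[pos] & 0x80)
        pos += 1
        if not info["90kHz_flag"]:
            # N and K are carried only when the time base is not 90 kHz.
            info["N"] = int.from_bytes(payload[pos:pos + 4], "big")
            info["K"] = int.from_bytes(payload[pos + 4:pos + 8], "big")
            pos += 8
        info["num_units_in_tick"] = int.from_bytes(payload[pos:pos + 4], "big")
        pos += 4
    if pos >= len(payload):
        raise ValueError("descriptor shorter than its flags imply")
    last = payload[pos]
    info["fixed_frame_rate_flag"] = bool(last & 0x80)
    info["temporal_poc_flag"] = bool(last & 0x40)
    info["picture_to_display_conversion_flag"] = bool(last & 0x20)
    return info
```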

AVC Video Descriptor

For individual MPEG-4.10 (H.264) video 
streams carried in PES packets, this MPEG-2 
descriptor describes the coding parameters of 
the video stream. When this descriptor is not 
present in the PMT, the stream should not con- 
tain H.264 still images or H.264 24-hour pic- 
tures. 

Descriptor_tag 

This 8-bit field has a value of “0010 1000.” 

Descriptor_length 

This 8-bit binary number specifies the 
number of bytes following this field. It has a 
value of “0000 0100.” 

Profile_IDC 

This 8-bit field identifies the profile of the 
H.264 video stream per the H.264 specification. 

Constraint_set0_flag

This 1-bit field is specified in the H.264
specification.

Constraint_set1_flag

This 1-bit field is specified in the H.264 
specification. 

Constraint_set2_flag 

This 1-bit field is specified in the H.264 
specification. 

AVC_compatible_flags 

This 5-bit field has the same value as
reserved_zero_5bits in the sequence parameter
set in the H.264 specification.
Level_IDC

This 8-bit field identifies the level of the 
MPEG-4.10 (H.264) video stream per the 
MPEG-4.10 (H.264) specification. 

AVC_still_present 

A “1” for this 1-bit flag indicates the H.264
video stream contains H.264 still images. A “0” 
indicates the H.264 video stream should not 
contain H.264 still images. 

AVC_24_hour_picture_flag 

A “1” for this 1-bit flag indicates the H.264 
video stream contains 24-hour pictures, which 
are access units having presentation times 
exceeding 24 hours. A “0” indicates the H.264 
video stream should not contain 24-hour pic- 
tures. 

Reserved_bits 

These 6 bits are always “11 1111.” 
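
With a fixed length of four payload bytes, the AVC video descriptor is straightforward to decode. The Python sketch below assumes the fields are packed most significant bit first in the order listed above. The function name and dictionary keys are hypothetical.

```python
def parse_avc_video_descriptor(descriptor: bytes) -> dict:
    """Decode an AVC video descriptor (tag 0x28, length 4)."""
    if len(descriptor) != 6 or descriptor[0] != 0x28 or descriptor[1] != 4:
        raise ValueError("not an AVC video descriptor")
    profile_idc, flags, level_idc, tail = descriptor[2:6]
    return {
        "profile_idc": profile_idc,
        "constraint_set0_flag": bool(flags & 0x80),
        "constraint_set1_flag": bool(flags & 0x40),
        "constraint_set2_flag": bool(flags & 0x20),
        "avc_compatible_flags": flags & 0x1F,
        "level_idc": level_idc,
        "avc_still_present": bool(tail & 0x80),
        "avc_24_hour_picture_flag": bool(tail & 0x40),
    }
```

A decoder could compare profile_idc and level_idc against its own capabilities before committing to decode the stream.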

CA Descriptor 

This MPEG-2 descriptor indicates the 
PIDs of transport stream packets which con- 
tain ECM, EMM, or SRM information. If 
present in CAT, then a system-wide conditional 
access management system exists. If present 
in PMT, CA_PID points to packets containing 
program-related access control information 
(ECM). 

Descriptor_tag 

This 8-bit field has a value of “0000 1001.” 

Descriptor_length 

This 8-bit binary number specifies the 
number of bytes following this field. 



CA_system_ID 

This 16-bit binary number specifies the 
type of conditional access used. The coding of 
this field is privately defined. A value of 
0x4ADD identifies it as being an ATSC SRM 
Reference Descriptor. 

Reserved_bits 

These 3 bits are always “111.” 

CA_PID or SRM_PID 

This 13-bit binary number specifies the 
PID of the transport stream packets which 
contain either ECM, EMM, or SRM informa- 
tion for the conditional access system specified 
by CA_system_ID. 

For transport streams, a value of 0x0003
indicates that IPMP is used by compo-
nents in the transport stream. For program
streams, the presence of stream_ID_extension
value 0x00 indicates that IPMP is used by com-
ponents in the program stream.

Private_data_byte 

These optional [n] bytes of private data are 
defined by the CA owner. 
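
A CA descriptor found in the CAT or PMT can be decoded as follows. This Python sketch is illustrative: the function name and dictionary keys are hypothetical, and the private data is returned untouched because its meaning is defined by the CA owner.

```python
def parse_ca_descriptor(descriptor: bytes) -> dict:
    """Decode a CA descriptor (tag 0x09)."""
    if len(descriptor) < 6 or descriptor[0] != 0x09:
        raise ValueError("not a CA descriptor")
    length = descriptor[1]
    if length < 4 or len(descriptor) < 2 + length:
        raise ValueError("descriptor truncated")
    ca_system_id = int.from_bytes(descriptor[2:4], "big")
    return {
        "ca_system_id": ca_system_id,
        # Drop the 3 reserved bits to leave the 13-bit CA_PID or SRM_PID.
        "ca_pid": int.from_bytes(descriptor[4:6], "big") & 0x1FFF,
        "private_data": descriptor[6:2 + length],
        "is_atsc_srm_reference": ca_system_id == 0x4ADD,
    }
```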

Caption Service Descriptor 

For OpenCable™ and generic MPEG-2
decoders, this CEA-708 descriptor must be
present for each program that has closed cap-
tioning. For ATSC and OpenCable™, it must
also be present in the descriptor_loop of the
event information table (EIT).

Up to sixteen individual service descrip- 
tions are supported, with each service descrip- 
tion being 6 bytes in length. 

Descriptor_tag 

This 8-bit field has a value of “1000 0110.” 
Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. 

Reserved_bits 

These 3 bits are always “111.” 

Number_of_services 

This 5-bit field has a range of 1-16 to indi- 
cate the number of captioning services 
present. 



Note: [number_of_services] specifies how many times the
following nine fields are repeated.

Language 

This 3-byte code specifies the language 
associated with the caption service, per ISO 
639.2/B. 

Digital_cc 

A “1” for this 1-bit flag indicates CEA-708
captioning service is present. A “0” indicates 
CEA-608 captioning is present. 

Reserved_bit 

This bit is always “1.” 



Reserved_bits 

These 5 optional bits are always “1 1111.”
They are present only if digital_cc = “0.”

Line21_field 

A “1” for this optional 1-bit flag indicates
that CEA-608 captioning for field 2 is present.
A “0” indicates CEA-608 captioning for field 1 is
present. This bit is present only if digital_cc
= “0.”



Caption_service_number 

This optional 6-bit field has a range of 1-63
to indicate the service number of the caption-
ing stream. These bits are present only if
digital_cc = “1.”



Easy_reader 

A “1” for this 1-bit flag indicates that the 
caption service contains text formatted for 
beginning readers. A “0” indicates that the cap- 
tion service is not tailored for this. 

Wide_aspect_ratio 

A “1” for this 1-bit flag indicates that the 
caption service is formatted for 16:9 displays. A 
“0” indicates that the caption service is format- 
ted for 4:3 displays, and may be optionally dis- 
played centered on 16:9 displays. 

Reserved_bits 

These 14 bits are always “11 1111 1111 1111.”
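
The field layout above can be sketched as a small parser. This is a minimal illustration, assuming the fields are packed most-significant bit first in the order listed and that the caption_type condition in the text refers to the Digital_cc flag; the function name is hypothetical.

```python
def parse_caption_service_descriptor(data: bytes) -> list:
    """Parse a caption service descriptor (descriptor_tag "1000 0110" = 0x86)."""
    if len(data) < 3 or data[0] != 0x86:
        raise ValueError("not a caption service descriptor")
    number_of_services = data[2] & 0x1F   # low 5 bits; the upper 3 bits are reserved
    services = []
    pos = 3
    for _ in range(number_of_services):
        entry = data[pos:pos + 6]         # each service description is 6 bytes
        if len(entry) < 6:
            raise ValueError("truncated service description")
        digital_cc = entry[3] >> 7
        service = {
            "language": entry[0:3].decode("ascii"),
            "digital_cc": bool(digital_cc),
            "easy_reader": bool(entry[4] & 0x80),
            "wide_aspect_ratio": bool(entry[4] & 0x40),
        }
        if digital_cc:
            # CEA-708 service: 6-bit caption_service_number
            service["caption_service_number"] = entry[3] & 0x3F
        else:
            # CEA-608 service: 5 reserved bits, then the 1-bit line21_field
            service["line21_field"] = entry[3] & 0x01
        services.append(service)
        pos += 6
    return services
```

For example, a descriptor carrying one English CEA-708 service with service number 1 would be decoded into a single dictionary whose "language" is "eng" and whose "caption_service_number" is 1.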

Copyright Descriptor 

This MPEG-2 descriptor provides a 
method to enable audio-visual works identifica- 
tion. For the DVB standard, this descriptor is 
optional, and may be ignored by the decoder if 
present. 

Descriptor_tag 

This 8-bit binary number has a value of 
“0000 1101.”

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field.

Video Stream                        Audio Stream
Alignment Type                      Alignment Type    Code

reserved                            reserved          0000 0000
slice, picture, GOP, or sequence    sync word         0000 0001
picture, GOP, or sequence           reserved          0000 0010
GOP or sequence                     reserved          0000 0011
sequence                            reserved          0000 0100
reserved                            reserved          0000 0101 to 1111 1111

Table 13.50. alignment_type Codewords.



Copyright_ID

This 32-bit value is obtained from the Reg- 
istration Authority. 

Additional_copyright_info 

These optional [n] bytes of data are 
defined by the copyright owner and are never 
changed. 

Data Stream Alignment Descriptor 

This MPEG-2 descriptor describes the 
alignment of video stream syntax with respect 
to the start of the PES packet payload. ATSC 
requires this descriptor to be present in the 
program element loop of the PMT section that 
describes the video elementary stream. For 
the DVB standard, this descriptor is optional, 
and may be ignored by the decoder if present. 

Descriptor_tag 

This 8-bit binary number has a value of 
“0000 0110.”



Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. It has a 
value of “0000 0001.” 

Alignment_type 

This 8-bit codeword specifies the audio or 
video alignment type as shown in Table 13.50. 
For the ATSC standard, the value must be 
“0000 0010.”

DTCP Descriptor 

This descriptor is used to control HDCP- 
and DTCP-protected digital outputs, such as 
IEEE 1394, USB, HDMI, and IP networks. 

Descriptor_tag 

This 8-bit binary number has a value of 
“1000 1000.”

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field.

CA_system_ID

This 16-bit binary number specifies the 
type of conditional access used. It has a value 
of 0x0FFF (DTLA).

Reserved_bits 

These 5 bits are always “1 1111.” 

EPN 

A “1” for this 1-bit Encryption Plus Non- 
Assertion flag specifies that content not other- 
wise copy controlled is not to be retransmitted 
over the Internet. 

DTCP_CCI 

This 2-bit codeword specifies the digital 
copy generation management: 

00 = copy free 

01 = no more copies 

10 = copy one generation 

11 = copy never 

Reserved_bits 

These 5 bits are always “1 1111.” 

Image_constraint_token

A “1” for this 1-bit flag specifies that high- 
definition content must be constrained to 
520,000 total samples or less (960 x 540p, for 
example) when output onto unprotected high- 
definition analog video outputs. 

APS 

This 2-bit codeword specifies the Analog
Protection Service (APS):

00 = no Analog Protection Service 

01 = PSP on, color striping off 

10 = PSP on, 2-line color striping on 

11 = PSP on, 4-line color striping on 
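
A minimal sketch of decoding these fields, assuming they are packed most-significant bit first in exactly the order listed above (two flag bytes following CA_system_ID). The function name and dictionary keys are hypothetical.

```python
DTCP_CCI = {
    0b00: "copy free",
    0b01: "no more copies",
    0b10: "copy one generation",
    0b11: "copy never",
}

APS = {
    0b00: "no Analog Protection Service",
    0b01: "PSP on, color striping off",
    0b10: "PSP on, 2-line color striping on",
    0b11: "PSP on, 4-line color striping on",
}

def parse_dtcp_descriptor(data: bytes) -> dict:
    """Parse a DTCP descriptor (descriptor_tag "1000 1000" = 0x88)."""
    if len(data) < 6 or data[0] != 0x88:
        raise ValueError("not a DTCP descriptor")
    flags1, flags2 = data[4], data[5]
    return {
        "ca_system_id": int.from_bytes(data[2:4], "big"),
        # byte 4: 5 reserved bits, EPN (1 bit), DTCP_CCI (2 bits)
        "epn": (flags1 >> 2) & 0x01,
        "dtcp_cci": DTCP_CCI[flags1 & 0x03],
        # byte 5: 5 reserved bits, image constraint token (1 bit), APS (2 bits)
        "image_constraint_token": (flags2 >> 2) & 0x01,
        "aps": APS[flags2 & 0x03],
    }
```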



DTS Audio Descriptor 

PES packets containing DTS® audio may
be included in an MPEG-2 program or transport
stream in the same way as MPEG or Dolby®
Digital audio.

MPEG-2 does not explicitly support a 
DTS® bitstream. Also, the MPEG-2 audio 
stream descriptor is inadequate to describe the 
contents of the DTS® bitstream in the PSI 
tables. 

Therefore, PES packets containing DTS® 
audio data are sent using private stream 1. A 
Registration Descriptor (descriptor_tag = “0000
0101”) and DTS Audio Descriptor
(descriptor_tag = “1001 0001” or “0111 0011”)
are also required. 

Hierarchy Descriptor 

This MPEG-2 descriptor provides informa- 
tion to identify the program elements contain- 
ing components of hierarchically coded video, 
audio, and private streams. 

Descriptor_tag 

This 8-bit field has a value of “0000 0100.” 

Descriptor_length 

This 8-bit binary number specifies the 
number of bytes following this field. It has a 
value of “0000 0100.” 

Reserved_bits 

These 4 bits are always “1111.” 

Hierarchy_type 

This 4-bit codeword indicates the relation- 
ship between the hierarchy layer and its 
embedded layer as shown in Table 13.51.

Hierarchy Type           Code

reserved                 0000
spatial scalability      0001
SNR scalability          0010
temporal scalability     0011
data partitioning        0100
extension bitstream      0101
private bitstream        0110
multi-view profile       0111
reserved                 1000-1110
base layer               1111

Table 13.51. hierarchy_type Codewords.



Reserved_bits 

These 2 bits are always “11.” 

Hierarchy_layer_index 

This 6-bit binary number indicates a 
unique index of the stream. 

Reserved_bits 

These 2 bits are always “11.” 

Hierarchy_embedded_layer_index 

This 6-bit binary number defines the hier- 
archical table index of the stream that must be 
accessed prior to decoding. This parameter is 
undefined for a hierarchy_type value of “1111.”

Reserved_bits 

These 2 bits are always “11.” 

Hierarchy_channel 

This 6-bit binary number indicates the 
intended channel number. The most robust 
channel has the lowest value. 
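
The four payload bytes can be unpacked as follows. This is a sketch only, assuming the reserved bits occupy the most-significant positions of each byte as listed above; the function name is hypothetical.

```python
HIERARCHY_TYPES = {
    0x1: "spatial scalability",
    0x2: "SNR scalability",
    0x3: "temporal scalability",
    0x4: "data partitioning",
    0x5: "extension bitstream",
    0x6: "private bitstream",
    0x7: "multi-view profile",
    0xF: "base layer",
}

def parse_hierarchy_descriptor(data: bytes) -> dict:
    """Parse a hierarchy descriptor (descriptor_tag "0000 0100", length 4)."""
    if len(data) < 6 or data[0] != 0x04 or data[1] != 4:
        raise ValueError("not a hierarchy descriptor")
    hierarchy_type = data[2] & 0x0F
    return {
        "hierarchy_type": HIERARCHY_TYPES.get(hierarchy_type, "reserved"),
        "hierarchy_layer_index": data[3] & 0x3F,
        # The embedded layer index is undefined for the base layer ("1111").
        "hierarchy_embedded_layer_index":
            None if hierarchy_type == 0xF else data[4] & 0x3F,
        "hierarchy_channel": data[5] & 0x3F,
    }
```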



IBP Descriptor 

This MPEG-2 descriptor provides informa- 
tion about some characteristics of the 
sequence of frame types in the video sequence. 
For the DVB standard, this descriptor is 
optional, and may be ignored by the decoder if 
present. 

Descriptor_tag 

This 8-bit field has a value of “0001 0010.” 

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. It has a 
value of “0000 0010.” 

Closed_gop_flag 

A “1” for this 1-bit flag indicates that a
group of pictures header is encoded before 
every I-frame and that the closed_gop flag is set 
to “1” in all group of pictures headers in the 
video sequence. 

Identical_gop_flag 

A “1” for this 1-bit flag indicates the num- 
ber of P-frames and B-frames between I- 
frames, and the picture coding types and 
sequence of picture types between I-pictures, 
is the same throughout the sequence, except 
possibly for pictures up to the second I-picture. 

Max_gop_length 

This 14-bit binary number indicates the 
maximum number of the coded pictures 
between any two consecutive I-pictures in the 
sequence. The value zero may not be used. 
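
As an illustration, the two flags and the 14-bit length share one 16-bit word. The sketch below assumes the fields are packed most-significant bit first in the order listed; the function name is hypothetical.

```python
def parse_ibp_descriptor(data: bytes) -> dict:
    """Parse an IBP descriptor (descriptor_tag "0001 0010", length 2)."""
    if len(data) < 4 or data[0] != 0x12 or data[1] != 2:
        raise ValueError("not an IBP descriptor")
    word = int.from_bytes(data[2:4], "big")
    max_gop_length = word & 0x3FFF          # low 14 bits
    if max_gop_length == 0:
        raise ValueError("max_gop_length of zero may not be used")
    return {
        "closed_gop_flag": bool(word & 0x8000),
        "identical_gop_flag": bool(word & 0x4000),
        "max_gop_length": max_gop_length,
    }
```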








IPMP Descriptor 

This MPEG-2 descriptor signals IPMP tool 
protection, associating IPMP tools with each 
protected program and indicating the control 
point at which a specific IPMP tool should be 
running. 

The MPEG-2 IPMP descriptor has a 
descriptor_tag value of “0010 1001.”

ISO 639 Language Descriptor 

This MPEG-2 descriptor provides a 
method to indicate the language(s) of each
audio elementary stream. If present, ATSC
requires this descriptor to be in descriptor_loop
following ES_info_length in the PMT for each
Dolby® Digital or Dolby® Digital Plus audio 
elementary stream. For the DVB standard, 
this descriptor must be present and decoded if 
more than one audio (or video) stream with 
different languages is used for a program. 

Descriptor_tag 

This 8-bit field has a value of “0000 1010.” 

Descriptor_length 

This 8-bit binary number specifies the 
number of bytes following this field. 



Note: The following two fields are present for each lan- 
guage. 

ISO_639_language_code 

This 24-bit field contains a 3-character lan- 
guage code. 



Audio_type 

This 8-bit codeword identifies the audio 
type: 

0x00 = reserved 

0x01 = clean effects (no language) 

0x02 = hearing impaired 

0x03 = visual impaired commentary 

0x04-0xFF = reserved 
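
Since each language entry is a fixed 4 bytes (3-character code plus audio type), the loop can be sketched as below. The function name is hypothetical, and the audio type labels are simply those listed above.

```python
AUDIO_TYPES = {
    0x01: "clean effects (no language)",
    0x02: "hearing impaired",
    0x03: "visual impaired commentary",
}

def parse_iso_639_language_descriptor(data: bytes) -> list:
    """Parse an ISO 639 language descriptor (descriptor_tag "0000 1010")."""
    if len(data) < 2 or data[0] != 0x0A:
        raise ValueError("not an ISO 639 language descriptor")
    length = data[1]
    body = data[2:2 + length]
    if len(body) != length or length % 4 != 0:
        raise ValueError("descriptor length is not a whole number of entries")
    entries = []
    for i in range(0, length, 4):
        code = body[i:i + 3].decode("latin-1")   # 24-bit, 3-character code
        audio_type = AUDIO_TYPES.get(body[i + 3], "reserved")
        entries.append((code, audio_type))
    return entries
```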



Maximum Bit-Rate Descriptor 

This MPEG-2 descriptor provides a 
method to indicate information about the maxi- 
mum bit-rate present. It only applies to trans- 
port streams, not program streams. For the 
DVB standard, this descriptor is optional, and 
may be ignored by the decoder if present. 

Descriptor_tag 

This 8-bit field has a value of “0000 1110.” 

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. It has a 
value of “0000 0011.” 

Reserved_bits 

These 2 bits are always “11.” 

Maximum_bitrate 

This 22-bit binary number indicates the 
maximum bit-rate present, in units of 50 bytes 
per second. 
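
Because the field counts units of 50 bytes per second, converting it to bits per second is a multiplication by 400. A minimal sketch, with a hypothetical function name:

```python
def maximum_bitrate_bps(data: bytes) -> int:
    """Return the maximum bit-rate, in bits per second, from a
    maximum bit-rate descriptor (descriptor_tag "0000 1110", length 3)."""
    if len(data) < 5 or data[0] != 0x0E or data[1] != 3:
        raise ValueError("not a maximum bit-rate descriptor")
    # 2 reserved bits followed by the 22-bit maximum_bitrate field
    maximum_bitrate = int.from_bytes(data[2:5], "big") & 0x3FFFFF
    return maximum_bitrate * 50 * 8    # 50-byte units -> bits per second
```

A field value of 50,000, for instance, corresponds to 20 Mbps.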

Metadata Descriptors 

Various MPEG-2 descriptors are available 
to enable including metadata information that 
describes audiovisual content and data essence.

MPEG-2 defines two tools for synchronous
delivery of the metadata: 

PES packet payload 

DSM-CC synchronized download protocol 

In addition, MPEG-2 defines three tools for 
asynchronous delivery of metadata: 

Carriage in metadata sections 
DSM-CC data carousels 
DSM-CC object carousels 



Content Labeling Descriptor 

This MPEG-2 descriptor assigns a label to 
content; the label can be used by metadata to 
reference the associated content. It also pro- 
vides information on which content time base 
is used and on the offset between the content 
time base and the metadata time base. 

This descriptor has a descriptor_tag value
of “0010 0100.” 

Metadata Pointer Descriptor 

This MPEG-2 descriptor points to one 
metadata service and associates its service 
with audiovisual content in an MPEG-2 stream. 

This descriptor has a descriptor_tag value
of “0010 0101.” 

Metadata Descriptor 

This MPEG-2 descriptor specifies the format
of the associated metadata carried in a program
or transport stream. It can also convey
information to identify the metadata service 
from a collection of metadata transmitted in a 
DSM-CC carousel. 

This descriptor has a descriptor_tag value
of “0010 0110.” 



Metadata STD Descriptor 

This MPEG-2 descriptor defines parame- 
ters of the standard model for the processing 
of the metadata stream. 

This descriptor has a descriptor_tag value
of “0010 0111.” 

Multiplex Buffer Utilization Descriptor 

This MPEG-2 descriptor provides bounds 
on the occupancy of the STD multiplex buffer. 
For the DVB standard, this descriptor is 
optional, and may be ignored by the decoder if 
present. 

Descriptor_tag 

This 8-bit field has a value of “0000 1100.” 

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. It has a 
value of “0000 0100.” 

Bound_valid_flag 

A “1” for this bit indicates
LTW_offset_lower_bound and
LTW_offset_upper_bound are valid.

LTW_offset_lower_bound 

This 15-bit binary number is in units of (27 
MHz/300) clock periods. It specifies the low- 
est value any LTW_offset field will have. 

Reserved_bit 

This bit is always a “1.” 

LTW_offset_upper_bound 

This 15-bit binary number is in units of (27 
MHz/300) clock periods. It specifies the upper 
value any LTW_offset field will have. 
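
The (27 MHz/300) unit works out to a 90 kHz tick, so each bound can be converted to seconds by dividing by 90,000. The sketch below assumes the fields are packed most-significant bit first in the order listed; the function names are hypothetical.

```python
LTW_TICK_HZ = 27_000_000 // 300   # 90,000 ticks per second

def parse_multiplex_buffer_utilization_descriptor(data: bytes) -> dict:
    """Parse a multiplex buffer utilization descriptor
    (descriptor_tag "0000 1100", length 4)."""
    if len(data) < 6 or data[0] != 0x0C or data[1] != 4:
        raise ValueError("not a multiplex buffer utilization descriptor")
    first = int.from_bytes(data[2:4], "big")    # flag + 15-bit lower bound
    second = int.from_bytes(data[4:6], "big")   # reserved + 15-bit upper bound
    return {
        "bound_valid_flag": bool(first >> 15),
        "ltw_offset_lower_bound": first & 0x7FFF,
        "ltw_offset_upper_bound": second & 0x7FFF,
    }

def ltw_ticks_to_seconds(ticks: int) -> float:
    """Convert a count of (27 MHz/300) clock periods to seconds."""
    return ticks / LTW_TICK_HZ
```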

Private Data Indicator Descriptor

This MPEG-2 descriptor provides a 
method for carrying private information. For 
the DVB standard, this MPEG-2 descriptor is 
optional, and may be ignored by the decoder if 
present. 

Descriptor_tag 

This 8-bit field has a value of “0000 1111.” 
Descriptor_length 

This 8-bit binary number specifies the 
number of bytes following this field. It has a 
value of “0000 0100.” 

Private_data_indicator 

These 32 bits are private data and are not 
defined by the MPEG-2 specification. 

Registration Descriptor 

This MPEG-2 descriptor provides a 
method to uniquely identify formats of private 
data. 

Programs that conform to ATSC are identified
by this descriptor in descriptor_loop after
program_info_length of the PMT.

This descriptor is also placed in
descriptor_loop after ES_info_length of the PMT
for each program element having a stream_type
value in the ATSC-user private range (0xC4 to
0xFF), to establish the private entity associated
with that program element.

For the DVB standard, this descriptor is 
optional, and may be ignored by the decoder if 
present. 

Descriptor_tag 

This 8-bit field has a value of “0000 0101.” 



Descriptor_length 

This 8-bit binary number specifies the 
number of bytes following this field. 

Format_identifier 

This 32-bit value is obtained from the 
SMPTE Registration Authority, with some 
common ones illustrated in Table 13.52. 



Format                       Code

ATSC                         0x47413934
AVS (GB/T 20090.2-2006)      0x41565356
Dolby Digital audio          0x41432D33
DTS audio (512 frame)        0x44545331
DTS audio (1024 frame)       0x44545332
DTS audio (2048 frame)       0x44545333
SCTE 54 standard             0x53435445
SMPTE 421M (VC-1) video      0x56432D31

Table 13.52. format_identifier Codewords.

Additional_identification_info 

These optional [n] bytes of data are 
defined by the registration owner and are 
never changed. 
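
Each code in Table 13.52 is four ASCII characters packed into 32 bits (0x47413934 reads as "GA94", 0x41432D33 as "AC-3"), which makes the identifier easy to display. A minimal sketch, with a hypothetical function name and dictionary keys:

```python
def parse_registration_descriptor(data: bytes) -> dict:
    """Parse a registration descriptor (descriptor_tag "0000 0101")."""
    if len(data) < 2 or data[0] != 0x05:
        raise ValueError("not a registration descriptor")
    length = data[1]
    if length < 4 or len(data) < 2 + length:
        raise ValueError("descriptor too short for a format_identifier")
    return {
        "format_identifier": int.from_bytes(data[2:6], "big"),
        # Text form of the same 4 bytes, for display only
        "format_identifier_text": data[2:6].decode("ascii", errors="replace"),
        "additional_identification_info": data[6:2 + length],
    }
```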

Smoothing Buffer Descriptor 

This MPEG-2 descriptor conveys informa- 
tion about the size of a smoothing buffer asso- 
ciated with this descriptor and the associated 
leak rate out of that buffer. ATSC requires the 
PMT to have a Smoothing Buffer Descriptor 
pertaining to that program. For the DVB stan- 
dard, this descriptor is recommended, but may 
be ignored by the decoder if present. 

Descriptor_tag 

This 8-bit field has a value of “0001 0000.”

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. It has a 
value of “0000 0110.” 

Reserved_bits 

These 2 bits are always “11.” 

Sb_leak_rate 

This 22-bit binary number specifies the 
value of the leak rate out of the buffer for the 
associated elementary stream or other data in 
units of 400 bps. 

Reserved_bits 

These 2 bits are always “11.” 

Sb_size 

This 22-bit binary number specifies the
smoothing buffer size for the associated
elementary stream or other data in 1-byte units.
For ATSC and OpenCable™, this field has a
value <2048.
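
A short sketch of unpacking the two 22-bit fields and scaling the leak rate from 400 bps units to bits per second. It assumes each 24-bit group carries its 2 reserved bits in the most-significant positions; the function name is hypothetical.

```python
def parse_smoothing_buffer_descriptor(data: bytes) -> dict:
    """Parse a smoothing buffer descriptor (descriptor_tag "0001 0000", length 6)."""
    if len(data) < 8 or data[0] != 0x10 or data[1] != 6:
        raise ValueError("not a smoothing buffer descriptor")
    sb_leak_rate = int.from_bytes(data[2:5], "big") & 0x3FFFFF
    sb_size = int.from_bytes(data[5:8], "big") & 0x3FFFFF
    return {
        "sb_leak_rate_bps": sb_leak_rate * 400,   # field is in units of 400 bps
        "sb_size_bytes": sb_size,                 # field is in 1-byte units
    }
```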

STD Descriptor 

This MPEG-2 descriptor only applies to 
transport streams, not program streams. For 
the DVB standard, this descriptor may be 
ignored by the decoder. 

Descriptor_tag 

This 8-bit field has a value of “0001 0001.” 

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. It has a 
value of “0000 0001.” 

Reserved_bits 

These 7 bits are always “111 1111.” 



Leak_valid_flag 

This 1-bit flag specifies the technique used 
to transfer data between memory buffers. 

System Clock Descriptor 

This MPEG-2 descriptor provides a 
method to indicate information about the sys- 
tem clock used to generate timestamps. For 
the DVB standard, this descriptor is recom- 
mended, but may be ignored by the decoder if 
present. 

Descriptor_tag 

This 8-bit field has a value of “0000 1011.” 

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. It has a 
value of “0000 0010.” 

External_clock_reference_indicator

A “1” for this bit indicates that the system 
clock was derived from a reference that may 
be available at the decoder. 

Reserved_bit

This bit is always a “1.” 

Clock_accuracy_integer

Combined with clock_accuracy_exponent,
this 6-bit binary number provides the fractional 
frequency accuracy of the system clock. 

Clock_accuracy_exponent 

Combined with clock_accuracy_integer, this
3-bit binary number provides the fractional fre- 
quency accuracy of the system clock. 

Reserved_bits 

These 5 bits are always “1 1111.” 
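
The text does not spell out how the integer and exponent are combined. The sketch below assumes the common mantissa/exponent reading, accuracy = clock_accuracy_integer x 10^(-clock_accuracy_exponent), which should be checked against the MPEG-2 systems specification before being relied on. The function name is hypothetical.

```python
def parse_system_clock_descriptor(data: bytes) -> dict:
    """Parse a system clock descriptor (descriptor_tag "0000 1011", length 2)."""
    if len(data) < 4 or data[0] != 0x0B or data[1] != 2:
        raise ValueError("not a system clock descriptor")
    integer = data[2] & 0x3F     # low 6 bits of the first payload byte
    exponent = data[3] >> 5      # top 3 bits of the second payload byte
    return {
        "external_clock_reference_indicator": bool(data[2] & 0x80),
        "clock_accuracy_integer": integer,
        "clock_accuracy_exponent": exponent,
        # ASSUMPTION: integer * 10^-exponent, not stated in the text above
        "fractional_accuracy": integer * 10.0 ** -exponent,
    }
```

Under that assumption, an integer of 30 with an exponent of 6 would describe a clock accurate to 30 parts per million.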

Target Background Grid Descriptor

This MPEG-2 descriptor enables displaying
the video within a specified location of the
display. It is useful when the video is not
intended to use the full area of the display. For
the DVB standard, this descriptor is required
when the resolution is greater than 720 x 576
(25 Hz bitstream) or 720 x 480 (30 Hz
bitstream).

Descriptor_tag 

This 8-bit field has a value of “0000 0111.” 

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. It has a 
value of “0000 0100.” 

Horizontal_size 

This 14-bit binary number specifies the 
horizontal size of the target background grid in 
samples. 

Vertical_size 

This 14-bit binary number specifies the 
vertical size of the target background grid in 
lines. 

Aspect_ratio_information 

This 4-bit codeword specifies the aspect 
ratio as defined in the video sequence header. 
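
The three fields total 32 bits (14 + 14 + 4), so they can be read as a single big-endian word and split with shifts and masks. A minimal sketch, assuming the fields are packed in the order listed; the function name is hypothetical.

```python
def parse_target_background_grid_descriptor(data: bytes) -> dict:
    """Parse a target background grid descriptor
    (descriptor_tag "0000 0111", length 4)."""
    if len(data) < 6 or data[0] != 0x07 or data[1] != 4:
        raise ValueError("not a target background grid descriptor")
    word = int.from_bytes(data[2:6], "big")
    return {
        "horizontal_size": word >> 18,              # top 14 bits, in samples
        "vertical_size": (word >> 4) & 0x3FFF,      # next 14 bits, in lines
        "aspect_ratio_information": word & 0x0F,    # low 4 bits
    }
```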

Video Stream Descriptor 

This MPEG-2 descriptor provides basic 
information which identifies the coding param- 
eters of a video elementary stream. 

Descriptor_tag 

This 8-bit field has a value of “0000 0010.” 



Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. 

Multiple_frame_rate_flag

A “1” for this bit indicates multiple frame 
rates may be present in the video stream. 

Frame_rate_code 

This 4-bit codeword indicates the video 
frame rate, as shown in Table 13.53. When 
multiple_frame_rate_flag is a “1,” the indication
of a specific frame rate also allows other frame 
rates to be present in the video stream. 

MPEG_1_only_flag

A “1” for this bit indicates the video stream 
contains only MPEG-1 video data. 

Constrained_parameter_flag 

If MPEG_1_only_flag is a “0,” this bit must
be a “1.” If MPEG_1_only_flag is a “1,” this bit
reflects the value of the
constrained_parameter_flag in the MPEG-1
video stream.

Still_picture_flag 

A “1” for this bit indicates the video stream 
contains only still pictures. A “0” indicates the 
video stream may have either still or moving 
pictures. 

Profile_and_level_indication

This optional 8-bit codeword reflects the 
same or higher profile and level as indicated by 
the profile_and_level_indication field in the
MPEG-2 video stream. This field is present
only if MPEG_1_only_flag = “0.”







Indicated Frame    May Also Include
Rate               These Frame Rates                    Code

forbidden                                               0000
23.976                                                  0001
24.0               23.976                               0010
25.0                                                    0011
29.97              23.976                               0100
30.0               23.976, 24.0, 29.97                  0101
50.0               25.0                                 0110
59.94              23.976, 29.97                        0111
60.0               23.976, 24.0, 29.97, 30.0, 59.94     1000
reserved                                                1001-1111

Table 13.53. frame_rate_code Codewords.



Chroma_format 

This optional 2-bit codeword reflects the 
same or higher chroma format as indicated by 
the chroma_format field in the MPEG-2 video
stream. This field is present only if
MPEG_1_only_flag = “0.”

Frame_rate_extension_flag 

A “1” for this optional bit indicates that 
either or both of the frame_rate_extension_n 
and frame_rate_extension_d fields in any 
MPEG-2 video stream are non-zero. This field 
is present only if MPEG_1_only_flag = “0.”

Reserved_bits 

These five optional bits are always “1 
1111.” This field is present only if 
MPEG_1_only_flag = “0.”
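
The conditional fields are easiest to see in code: the two extra bytes are read only when MPEG_1_only_flag is “0.” The sketch below assumes the fields are packed most-significant bit first in the order described in this section; the function name is hypothetical, and the frame rate lookup uses the indicated rates of Table 13.53.

```python
FRAME_RATES = {
    0b0001: 23.976, 0b0010: 24.0, 0b0011: 25.0, 0b0100: 29.97,
    0b0101: 30.0, 0b0110: 50.0, 0b0111: 59.94, 0b1000: 60.0,
}

def parse_video_stream_descriptor(data: bytes) -> dict:
    """Parse a video stream descriptor (descriptor_tag "0000 0010")."""
    if len(data) < 3 or data[0] != 0x02:
        raise ValueError("not a video stream descriptor")
    flags = data[2]
    result = {
        "multiple_frame_rate_flag": bool(flags & 0x80),
        # None for the forbidden and reserved codes
        "frame_rate": FRAME_RATES.get((flags >> 3) & 0x0F),
        "mpeg_1_only_flag": bool(flags & 0x04),
        "constrained_parameter_flag": bool(flags & 0x02),
        "still_picture_flag": bool(flags & 0x01),
    }
    if not result["mpeg_1_only_flag"]:
        if len(data) < 5:
            raise ValueError("missing MPEG-2 specific fields")
        result["profile_and_level_indication"] = data[3]
        result["chroma_format"] = data[4] >> 6
        result["frame_rate_extension_flag"] = bool(data[4] & 0x20)
    return result
```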



Video Window Descriptor 

This MPEG-2 descriptor provides a 
method to indicate information about the asso- 
ciated video elementary stream. For the DVB 
standard, if this field is present, the decoder 
must process the data. 

Descriptor_tag 

This 8-bit field has a value of “0000 1000.” 

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. It has a 
value of “0000 0100.” 

Horizontal_offset 

This 14-bit binary number indicates the 
horizontal position of the top left pixel of the 
video window on the target background grid.

Vertical_offset

This 14-bit binary number indicates the 
vertical position of the top left pixel of the 
video window on the target background grid. 

Window_priority 

This 4-bit binary number indicates how 
video windows overlap. A value of “0000” is 
lowest priority and “1111” is highest priority. 
Higher priority windows are visible over lower 
priority windows. 

MPEG-4 PMT/PSM 
Descriptors 

These MPEG-4 descriptors are used to 
identify MPEG-4-specific information that is 
present in the MPEG-2 transport or program 
stream. They are typically contained within a 
descriptor_loop in the MPEG-2 PMT or PSM,
and may also be present in other MPEG-4-spe- 
cific tables. 

MPEG-4 Audio Descriptor 

For individual MPEG-4.3 streams carried 
in PES packets, this MPEG-4 descriptor pro- 
vides basic information for identifying the cod- 
ing parameters. It does not apply to MPEG-4.3 
streams encapsulated in MPEG-4 SL-packets 
or FlexMux packets. 

Descriptor_tag 

This 8-bit field has a value of “0001 1100.” 

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. It has a 
value of “0000 0001.” 



MPEG-4_audio_profile_and_level 

This 8-bit field identifies the profile and 
level of the MPEG-4.3 audio stream. 

External ES ID Descriptor 

This MPEG-4 descriptor assigns an ES_ID, 
defined in MPEG-4.1, to a program element 
which has no ES_ID value. ES_ID allows refer- 
ence to a non-MPEG-4 component in the scene 
description or to associate a non-MPEG-4 com- 
ponent within an IPMP stream. 

For a transport stream, this descriptor is in 
descriptor_loop after ES_info_length within the
PMT. For a program stream, within the PSM,
this descriptor is in descriptor_loop after
elementary_stream_info_length.

Descriptor_tag 

This 8-bit field has a value of “0010 0000.” 

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. It has a 
value of “0000 0010.” 

External_ES_ID

This 16-bit field assigns ES_ID, as defined 
in MPEG-4.1, to a component of a program. 

FMC Descriptor 

This MPEG-4 descriptor indicates that the 
MPEG-4 FlexMux tool has been used to multi- 
plex MPEG-4 SL-packetized streams into a 
FlexMux stream before encapsulation in 
MPEG-2 PES packets or MPEG-4 sections. It 
associates FlexMux channels to the ES_ID val- 
ues of the SL-packetized streams in the 
FlexMux stream. An FMC Descriptor is 
required for each program element referenced 
by an elementary_PID value in a transport
stream and for each elementary_stream_ID in a
program stream that conveys a FlexMux 
stream. 

For a transport stream, this descriptor is in 
the descriptor_loop after ES_info_length within
the PMT. For a program stream, within the
PSM, this descriptor is in descriptor_loop after
elementary_stream_info_length.

Descriptor_tag 

This 8-bit field has a value of “0001 1111.” 

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. 

ES_ID 

This 16-bit field specifies the identifier of a 
SL-packetized stream. 

FlexMuxChannel 

This 8-bit field specifies the number of the 
FlexMux channel used for this SL-packetized 
stream. 
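
The text lists ES_ID and FlexMuxChannel as the payload fields. The sketch below assumes this 3-byte pair simply repeats until descriptor_length is exhausted, which the text implies (the descriptor associates several channels with ES_ID values) but does not state outright. The function name is hypothetical.

```python
def parse_fmc_descriptor(data: bytes) -> dict:
    """Parse an FMC descriptor (descriptor_tag "0001 1111").

    Returns a mapping of FlexMux channel number to ES_ID.
    """
    if len(data) < 2 or data[0] != 0x1F:
        raise ValueError("not an FMC descriptor")
    length = data[1]
    body = data[2:2 + length]
    if len(body) != length or length % 3 != 0:
        raise ValueError("descriptor length is not a whole number of entries")
    channels = {}
    for i in range(0, length, 3):
        es_id = int.from_bytes(body[i:i + 2], "big")   # 16-bit ES_ID
        flex_mux_channel = body[i + 2]                 # 8-bit FlexMuxChannel
        channels[flex_mux_channel] = es_id
    return channels
```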

FmxBufferSize Descriptor 

This MPEG-4 descriptor conveys the size 
of the FlexMux buffer for each MPEG-4 SL- 
packetized stream multiplexed in an MPEG-4 
FlexMux stream. One FmxBufferSize Descriptor
is associated with each elementary_PID or
elementary_stream_ID conveying a FlexMux
stream. 

For a transport stream, this descriptor is in 
descriptor_loop after ES_info_length within the
PMT. For a program stream within the PSM,
this descriptor is in descriptor_loop after
elementary_stream_info_length.

Descriptor_tag 

This 8-bit field has a value of “0010 0010.” 



Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. 

DefaultFlexMuxBufferDescriptor() 

This descriptor specifies the default 
FlexMux buffer size for this FlexMux stream. 
It is defined in MPEG-4.1.

FlexMuxBufferDescriptor()

This descriptor specifies the FlexMux 
buffer size for one SL-packetized stream car- 
ried within the FlexMux stream. It is defined in 
MPEG-4.1.

IOD Descriptor 

This MPEG-4 descriptor encapsulates the 
MPEG-4.1 InitialObjectDescriptor structure. It 
allows access to MPEG-4 streams by identify- 
ing the ES_ID values of the MPEG-4.1 scene 
description and object descriptor streams. 
Both contain further information about the 
MPEG-4 streams that are part of the scene. 

For a transport stream, this descriptor is in 
descriptor_loop after program_info_length
within the PMT. For a program stream, within
the PSM, this descriptor is in descriptor_loop
after program_stream_info_length.

Descriptor_tag 

This 8-bit field has a value of “0001 1101.” 

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. 

Scope_of_IOD_label

This 8-bit field specifies the scope of the 
IOD_label field. A value of 0x10 indicates that
the IOD_label is unique within the program
stream or program in a transport stream. A
value of 0x11 indicates that the IOD_label is
unique within the transport stream in which 
the IOD descriptor is carried. All other values 
are reserved. 

IOD_label

This 8-bit field specifies the label of the 
IOD descriptor. 

InitialObjectDescriptor()

This structure is defined in MPEG-4.1. 

MultiplexBuffer Descriptor 

This MPEG-4 descriptor is associated with
each elementary_PID that contains an MPEG-4
FlexMux or SL-packetized stream, including 
those containing MPEG-4 sections. 

This descriptor only applies to transport 
streams, not program streams. For a transport 
stream, this descriptor is in descriptor_loop
after ES_info_length within the PMT.

Descriptor_tag

This 8-bit field has a value of “0010 0011.” 

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. It has a 
value of “0000 0110.” 

MB_buffer_size

This 24-bit field specifies the size (in bytes) 
of buffer [MB] of the elementary stream. 

TB_leak_rate

This 24-bit field specifies the rate (in units 
of 400 bps) at which data is transferred from 
transport buffer [TB] to multiplex buffer [MB] 
for the elementary stream. 



Muxcode Descriptor 

This MPEG-4 descriptor conveys
MuxCodeTableEntry structures as defined in
MPEG-4.1. MuxCodeTableEntries configure the
MuxCode mode of FlexMux. One or more
Muxcode Descriptors may be associated with each
elementary_PID or elementary_stream_ID
conveying an MPEG-4 FlexMux stream that
utilizes the MuxCode mode.

Within a transport stream, this descriptor 
is in descriptor_loop after ES_info_length in the
PMT. Within a program stream, within the
PSM, this descriptor is in descriptor_loop after
elementary_stream_info_length.

Descriptor_tag

This 8-bit field has a value of “0010 0001.” 

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. 

MuxCodeTableEntry() 

This structure is defined in MPEG-4.1. 

SL Descriptor 

This MPEG-4 descriptor is used when a 
single MPEG-4 SL-packetized stream is encap- 
sulated in MPEG-2 PES packets. It associates 
the ES_ID of the SL-packetized stream to an
elementary_PID or elementary_stream_ID.

This descriptor only applies to transport 
streams, not program streams. Within a trans- 
port stream, this descriptor is in 
descriptor_loop after ES_info_length within the
PMT. 

Descriptor_tag

This 8-bit field has a value of “0001 1110.”

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. It has a 
value of “0000 0010.” 

ES_ID 

This 16-bit field specifies the identifier of a 
SL-packetized stream. 

MPEG-4 Video Descriptor 

For MPEG-4.2 streams carried in PES 
packets, this MPEG-4 video descriptor pro- 
vides basic information for identifying the cod- 
ing parameters. It does not apply to MPEG-4.2 
streams encapsulated in SL-packets or 
FlexMux packets. 

Descriptor_tag 

This 8-bit field has a value of “0001 1011.” 

Descriptor_length 

This 8-bit binary number specifies the 
number of bytes following this field. It has a 
value of “0000 0001.” 

MPEG-4_visual_profile_and_level 

This 8-bit field identifies the profile and 
level of the MPEG-4.2 video stream. It has the 
same value as profile_and_level_indication in
the Visual Object sequence header in the asso- 
ciated MPEG-4.2 stream. 



ARIB PMT Descriptors 

These ARIB descriptors are used to iden- 
tify ARIB-specific information that is present in 
the MPEG-2 transport stream. They are typically
contained within a descriptor_loop in the
MPEG-2 PMT, and may also be present in
other ARIB-specific tables. ARIB descriptors 
not associated with the PMT are discussed in 
Chapter 18. 

Carousel Compatible Composite 
Descriptor 

This ARIB descriptor uses descriptors 
defined in the data carousel transmission spec- 
ification (ARIB STD-B24 Part 3) as sub- 
descriptors, and describes accumulation con- 
trol by applying the functions of the sub- 
descriptors. 

Component Descriptor 

This ARIB descriptor is the same as the 
one discussed in the DVB Descriptors section. 

Conditional Playback Descriptor 

This ARIB descriptor conveys the descrip- 
tion of conditional playback and the PID that 
transmits the ECM and EMM. 

Content Availability Descriptor 

This ARIB descriptor describes informa- 
tion to control the recording and output of con- 
tent by receivers. The encryption_mode flag 
indicates whether or not to encrypt the digital 
video outputs. It is used in combination with 
the Digital Copy Control Descriptor. 

Country Availability Descriptor 

This ARIB descriptor is the same as the 
one discussed in the DVB Descriptors section.

Data Component Descriptor

This ARIB descriptor identifies the data 
coding system standard. The syntax is the 
same as for the DVB Data Broadcast ID
Descriptor, except for different field names.

Descriptor_tag 

This 8-bit field has a value of “1111 1101.”

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. 

Data_component_ID 

This 16-bit field identifies the data broad- 
cast specification that is used to broadcast the 
data in the broadcast network. 



Note: The following field may be repeated [n] times. 

Additional_identifier_information 

The definition of this data depends on 
data_component_ID. 

Digital Copy Control Descriptor 

This ARIB descriptor contains information 
to control copy generation. For digital record- 
ing, the broadcasting service provider uses it 
to inform digital recording equipment about 
event recording and copyright information. It 
has a descriptor_tag value of “1100 0001.” 

Emergency Information Descriptor 

This ARIB descriptor is used to transmit an 
emergency alarm. It may only be used with ter- 
restrial digital audio, terrestrial digital televi- 
sion, BS digital or broadband CS broadcasting. 
It is also present in the descriptor_loop of the
ARIB network information table (NIT).



Descriptor_tag 

This 8-bit field has a value of “1111 1100.”

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. 



Note: The following seven fields may be repeated [n] 
times. 

Service_ID 

This 16-bit field identifies the broadcast 
program number. 

Start/end_flag 

This 1-bit flag has a value of “1” at the start
of emergency information and a value of “0” 
when transmission ends. 

Signal_type 

This 1-bit flag has a value of “0” and “1,” 
respectively, when Class 1 and 2 start signals 
are transmitted. 

Reserved_bits 

These 6 bits are always “11 1111.”

Area_code_length

This 8-bit binary number indicates the 
number of bytes following this field. 



Note: The following two fields may be repeated [n] times. 

Area_code 

This 12-bit field indicates the area code as 
defined in Notification No. 405. 

Reserved_bits 

These 4 bits are always “1111.” 
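
The field layout above can be sketched in Python as follows. This is only an illustration: the function name parse_emergency_information and the sample bytes are hypothetical, and the sketch assumes the fields are packed most significant bit first in the order listed, with the payload starting immediately after descriptor_length.

```python
def parse_emergency_information(payload: bytes) -> list:
    """Parse the bytes that follow descriptor_length (assumed MSB-first packing)."""
    entries = []
    pos = 0
    while pos < len(payload):
        if pos + 4 > len(payload):
            raise ValueError("truncated service entry")
        service_id = int.from_bytes(payload[pos:pos + 2], "big")
        flags = payload[pos + 2]          # start/end (1), signal_type (1), reserved (6)
        area_code_length = payload[pos + 3]
        pos += 4
        end = pos + area_code_length
        if end > len(payload) or area_code_length % 2:
            raise ValueError("bad area_code_length")
        # Each area entry is 12 bits of area_code followed by 4 reserved bits.
        area_codes = [int.from_bytes(payload[i:i + 2], "big") >> 4
                      for i in range(pos, end, 2)]
        entries.append({
            "service_id": service_id,
            "start_end_flag": flags >> 7,
            "signal_type": (flags >> 6) & 0x1,
            "area_codes": area_codes,
        })
        pos = end
    return entries

# Made-up payload: service 0x0408, start of emergency, Class 1, one area code 0x123.
sample_entries = parse_emergency_information(bytes.fromhex("0408bf02123f"))
```
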

Hierarchical Transmission Descriptor

This ARIB descriptor indicates the
relationship between hierarchical streams when
transmitting events hierarchically. It has a
descriptor_tag value of “1100 0000.”

Linkage Descriptor 

This ARIB descriptor provides a link to 
another service, transport stream, program 
guide, service information, software upgrade, 
etc. 

Mosaic Descriptor 

This ARIB descriptor is the same as the 
one discussed in the DVB Descriptors section. 

Parental Rating Descriptor 

This ARIB descriptor is the same as the 
one discussed in the DVB Descriptors section. 

Stream Identifier Descriptor 

This ARIB descriptor is the same as the 
one discussed in the DVB Descriptors section. 

System Management Descriptor 

This ARIB descriptor identifies the type of
broadcasting. It is also present in the
descriptor_loop of the ARIB network
information table (NIT).

Descriptor_tag 

This 8-bit field has a value of “1111 1110.”

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. 



System_management_ID 

This 16-bit field identifies the type of 
broadcasting: 

b0-b1:

00 = broadcasting 

01 = non-broadcasting 

10 = non-broadcasting 

11 = reserved 

b2-b7: 

000000 = reserved 

000001 = CS digital broadcast 

000010 = BS digital broadcast 

000011 = terrestrial digital TV broadcast 

000100 = broadband CS digital broadcast 

000101 = terrestrial digital audio broadcast 

000110-111111 = reserved 



The remaining 8 bits (b8-b15) make up
the additional_identifier_information field,
used to extend the broadcasting signal
standard.
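
As a rough illustration, the 16-bit system_management_ID can be split into its three parts in Python. The sketch assumes that b0 is the most significant (first transmitted) bit, so b0-b1 are the top two bits of the 16-bit value; the function name and the example value 0x0301 are hypothetical.

```python
BROADCASTING_FLAG = {
    0b00: "broadcasting",
    0b01: "non-broadcasting",
    0b10: "non-broadcasting",
    0b11: "reserved",
}

BROADCAST_TYPE = {
    0b000001: "CS digital broadcast",
    0b000010: "BS digital broadcast",
    0b000011: "terrestrial digital TV broadcast",
    0b000100: "broadband CS digital broadcast",
    0b000101: "terrestrial digital audio broadcast",
}

def decode_system_management_id(value: int) -> tuple:
    """Split the 16-bit value into (b0-b1, b2-b7, b8-b15), assuming b0 is the MSB."""
    flag = (value >> 14) & 0x3
    broadcast_type = (value >> 8) & 0x3F
    additional = value & 0xFF
    return (BROADCASTING_FLAG[flag],
            BROADCAST_TYPE.get(broadcast_type, "reserved"),
            additional)

decoded_id = decode_system_management_id(0x0301)
```
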

Target Region Descriptor 

This ARIB descriptor describes the target 
region of an event or a part of the stream com- 
prising an event. 

Video Decode Control Descriptor 

This ARIB descriptor is used to control 
video decoding when receiving MPEG-based 
still pictures and to smoothly display when the 
encode format changes at a video splice point. 

Descriptor_tag 

This 8-bit field has a value of “1100 1000.” 

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. 

Still_picture_flag

A “1” for this 1-bit flag indicates a still 
(MPEG) picture; a “0” indicates animation. 

Sequence_end_code_flag 

A “1” for this 1-bit flag indicates that the 
video stream has sequence end code at the end 
of a sequence. A “0” indicates that it does not 
have a sequence end code. 

Video_encode_format 

This 4-bit codeword indicates the video
encode format:

0000 = 1080p 

0001 = 1080i 

0010 = 720p 

0011 = 480p 

0100 = 480i 

0101 = 240p 

0110 = 120p 

0111-1111 = reserved 
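
A minimal Python sketch of decoding these three fields follows. It assumes they are packed most significant bit first into a single byte in the order listed, with the two low-order bits left over (their use is not described here); the function name and the sample byte are hypothetical.

```python
VIDEO_ENCODE_FORMAT = {
    0b0000: "1080p", 0b0001: "1080i", 0b0010: "720p", 0b0011: "480p",
    0b0100: "480i", 0b0101: "240p", 0b0110: "120p",
}

def parse_video_decode_control(payload: bytes) -> dict:
    """Decode the first payload byte (assumed layout: 1 + 1 + 4 bits, then 2 unused)."""
    if len(payload) < 1:
        raise ValueError("payload too short")
    b = payload[0]
    code = (b >> 2) & 0x0F
    return {
        "still_picture_flag": (b >> 7) & 0x1,
        "sequence_end_code_flag": (b >> 6) & 0x1,
        "video_encode_format": VIDEO_ENCODE_FORMAT.get(code, "reserved"),
    }

# Made-up byte 0100 0111: not a still picture, sequence end code present, 1080i.
vdc_example = parse_video_decode_control(bytes([0x47]))
```
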



ATSC PMT Descriptors 

These ATSC descriptors are used to identify
ATSC-specific information that is present
in the MPEG-2 transport stream. They are
typically contained within a descriptor_loop in
the MPEG-2 PMT, and may also be present in
other ATSC-specific tables. ATSC descriptors
not associated with the PMT are discussed in
Chapter 15.
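
Every descriptor in this chapter starts with the same two fields, an 8-bit descriptor_tag followed by an 8-bit descriptor_length, so a descriptor_loop can be walked without understanding each payload. The following Python sketch illustrates the idea; the function name iter_descriptors and the sample payload bytes are hypothetical and chosen only for illustration.

```python
def iter_descriptors(loop: bytes):
    """Yield (descriptor_tag, payload) for each descriptor in a descriptor_loop."""
    pos = 0
    while pos + 2 <= len(loop):
        tag = loop[pos]
        length = loop[pos + 1]          # number of bytes following this field
        end = pos + 2 + length
        if end > len(loop):
            raise ValueError("descriptor runs past the end of the loop")
        yield tag, loop[pos + 2:end]
        pos = end

# Made-up loop: tag 0x81 ("1000 0001") with three payload bytes,
# then tag 0xAA ("1010 1010") with an empty payload.
sample_loop = bytes([0x81, 0x03, 0x01, 0x02, 0x03, 0xAA, 0x00])
descriptors = list(iter_descriptors(sample_loop))
```

A receiver would typically dispatch on the tag and skip any descriptor it does not recognize, which is why the length field is carried in every descriptor.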

AC-3 Audio Stream Descriptor 

A Dolby® Digital (AC-3) audio elementary 
bitstream may be included within an MPEG-2 
bitstream in much the same way a standard 
MPEG audio stream is included. Like the 
MPEG audio bitstream, the Dolby® Digital bit- 
stream is packetized into PES packets. 



MPEG-2 does not explicitly support a 
Dolby® Digital bitstream. Also, the MPEG-2 
audio stream descriptor is inadequate to 
describe the contents of the Dolby® Digital bit- 
stream in the PSI tables. 

Therefore, PES packets containing Dolby®
Digital or Dolby® Digital Plus audio data are
sent using private stream 1. A Registration
Descriptor (not required for DVB systems) and
an AC-3 Audio Stream Descriptor with a
descriptor_tag value of “1000 0001” are also
required.

Starting March 1, 2008, this descriptor 
must carry the 3-byte ISO 639 language code, 
and must match the language code carried in 
the ISO 639 Language Descriptor, if present. 

Note that for ATSC and OpenCable™, the
AC-3 audio descriptor is titled “AC-3 Audio
Stream Descriptor,” while for DVB, the AC-3
audio descriptor is titled “AC-3 Descriptor.” The
syntax of these descriptors differs significantly
between the two systems.

ATSC Private Information Descriptor 

This ATSC descriptor provides a way to
carry private information. More than one
descriptor may appear within a single
descriptor_loop.

Descriptor_tag 

This 8-bit field has a value of “1010 1101.” 

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. 

Format_identifier 

This 32-bit binary number specifies the 
owner of the following private information, reg- 
istered with the SMPTE Registration Author- 
ity, as illustrated in Table 13.52. 

Private_data_byte

These optional [n] bytes of private data are
defined by the format_identifier owner.

Component Name Descriptor 

This ATSC and OpenCable™ descriptor
defines an optional textual name tag for any
component of the service.

Descriptor_tag 

This 8-bit field has a value of “1010 0011.” 

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. 

Component_name_string() 

Name string, based on ATSC’s Multiple 
String Structure. 

Content Advisory Descriptor 

This ATSC and OpenCable™ descriptor
defines the ratings for a given program. It is
also present in the descriptor_loop of the ATSC
and OpenCable™ event information table
(EIT).

Descriptor_tag 

This 8-bit field has a value of “1000 0111.” 

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. 

Rating_region_count 

This 6-bit binary number indicates the 
number (1-8) of rating region specifications 
that follow. 



Note: [rating_region_count] specifies how many times 
the following seven fields are repeated. 

Rating_region 

This 8-bit binary number specifies the rat- 
ing region for which the following data is 
defined. 

Rated_dimensions 

This 8-bit binary number specifies the 
number of rating dimensions for which content 
advisories are specified for this program. 



Note: [rated_dimensions] specifies how many times the
following three fields are repeated.

Rating_dimension_j 

This 8-bit binary number specifies the
dimension index into the ATSC RRT instance
for the region specified by the field
rating_region.

Reserved_bits 

These 4 bits are always “1111.” 

Rating_value 

This 4-bit binary number represents the
rating value of the dimension specified by the
field rating_dimension_j for the region given
by rating_region.



Rating_description_length 

This 8-bit binary number specifies the
length (0-80) of the rating_description_text
field that follows.

Rating_description_text() 

Rating description string, based on ATSC’s 
Multiple String Structure. 

Enhanced Signaling Descriptor

This ATSC descriptor identifies the terres- 
trial broadcast transmission method of a pro- 
gram element. If the program element is an 
alternative to another, it is linked to the alter- 
native one and the broadcaster’s preferences 
are specified. 

Descriptor_tag 

This 8-bit field has a value of “1011 0010.” 

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. 

Linkage_preference 

This 2-bit codeword indicates if the pro- 
gram element is linked to another program ele- 
ment. If linked, it also identifies the 
broadcaster’s preference. 

00 = not linked 

01 = linked, no preference 

10 = linked, preferred 

11 = linked, not preferred 

Tx_method 

This 2-bit codeword specifies the VSB 
transmission method used to transmit the 
associated program element. 

00 = main: main coding 

01 = half-rate: rate-1/2 enhanced coding 

10 = quarter-rate: rate-1/4 enhanced coding 

11 = reserved 



Linked_component_tag

An optional 4-bit value that links the program
element to an alternative. The alternative
is the program element with the same
linked_component_tag value in the transport
stream PMT labeled with an equivalent value
of program_number as the transport stream
PMT that carries this descriptor. This field is
present only when linkage_preference = “01,”
“10,” or “11.”



Reserved_bits 

These optional 4 bits are always “1111.” 
This field is present only when
linkage_preference = “00.”
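
The conditional final field can be illustrated with a short Python sketch. It assumes the fields are packed most significant bit first into one byte in the order listed (2 + 2 + 4 bits); the function name and the sample bytes are hypothetical.

```python
LINKAGE_PREFERENCE = {
    0b00: "not linked",
    0b01: "linked, no preference",
    0b10: "linked, preferred",
    0b11: "linked, not preferred",
}

TX_METHOD = {0b00: "main", 0b01: "half-rate", 0b10: "quarter-rate", 0b11: "reserved"}

def parse_enhanced_signaling(payload: bytes) -> dict:
    """Decode the first payload byte of the descriptor (assumed MSB-first packing)."""
    if len(payload) < 1:
        raise ValueError("payload too short")
    b = payload[0]
    linkage = (b >> 6) & 0x3
    tx_method = (b >> 4) & 0x3
    # The low 4 bits are linked_component_tag when linked, reserved bits otherwise.
    tag = (b & 0x0F) if linkage != 0 else None
    return {
        "linkage_preference": LINKAGE_PREFERENCE[linkage],
        "tx_method": TX_METHOD[tx_method],
        "linked_component_tag": tag,
    }

linked = parse_enhanced_signaling(bytes([0x95]))      # 10 01 0101
not_linked = parse_enhanced_signaling(bytes([0x0F]))  # 00 00 1111
```
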

Redistribution Control Descriptor 

This ATSC and OpenCable™ descriptor
(also known as the “broadcast flag”) conveys
any redistribution control information held by
the program rights holder for the content. It is
also present in the descriptor_loop of the ATSC
and OpenCable™ event information table (EIT).

Descriptor_tag 

This 8-bit field has a value of “1010 1010.” 

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. 

RC_information()

[n] bytes of optional additional redistribu- 
tion control information that may be defined in 
the future. 

DVB PMT Descriptors

These DVB descriptors are used to identify
DVB-specific information that is present in
the MPEG-2 transport stream. They are
typically contained within a descriptor_loop in
the MPEG-2 PMT, and may also be present in
other DVB-specific tables. DVB descriptors
not associated with the PMT are discussed in
Chapter 17.

AAC Audio Descriptor

For MPEG-4 AAC, HE-AAC, and HE-AAC
v2 audio streams carried in PES packets, this
DVB descriptor provides basic information for
identifying the coding parameters.

Descriptor_tag 

This 8-bit field has a value of “0111 1100.” 

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. 

Profile_and_level 

This 8-bit field identifies the profile and 
level of the MPEG-4 AAC, HE-AAC, or HE- 
AAC v2 audio. 

AAC_type_flag 

This 1-bit flag indicates the presence of the
AAC_type field.

Reserved_bits 

These 7 bits are always “000 0000.” 



AAC_type

This optional 8-bit field indicates the type
of audio carried in the MPEG-4 AAC, HE-AAC,
or HE-AAC v2 elementary stream. This field is
present only if AAC_type_flag = “1.”



Additional_info

[n] bytes of optional information. 

AC-3 and Enhanced AC-3 Descriptors 

A Dolby® Digital (AC-3) or Dolby® Digital 
Plus (E-AC-3) audio elementary bitstream may 
be included within an MPEG-2 bitstream in 
much the same way a standard MPEG audio 
stream is included. Like the MPEG audio bit- 
stream, the Dolby® Digital or Dolby® Digital 
Plus bitstream is packetized into PES packets. 

MPEG-2 does not explicitly support a 
Dolby® Digital or Dolby® Digital Plus bit- 
stream. Also, the MPEG-2 audio stream 
descriptor is inadequate to describe the con- 
tents of the Dolby® Digital or Dolby® Digital 
Plus bitstream in the PSI tables. 

Therefore, PES packets containing Dolby®
Digital or Dolby® Digital Plus audio data are
sent using private stream 1. An AC-3 Descriptor
with a descriptor_tag value of “0110 1010,” or an
Enhanced AC-3 Descriptor with a descriptor_tag
value of “0111 1010,” is also required.

Note that for ATSC and OpenCable™, the
AC-3 audio descriptor is titled “AC-3 Audio
Stream Descriptor,” while for DVB, the AC-3
audio descriptor is titled “AC-3 Descriptor.” The
syntax of these descriptors differs significantly
between the two systems.

Adaptation Field Data Descriptor

This DVB descriptor is used to indicate the 
type of data field supported in the private data 
field of the adaptation field. 

Descriptor_tag 

This 8-bit field has a value of “0111 0000.” 

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field, and has a 
value of 0x01. 

Adaptation_field_data_identifier 

This 8-bit field identifies data fields trans- 
mitted in the private data field of the adaptation 
field. If a bit is set to “1” it indicates that the 
corresponding data field is supported. 

b0 = announcement switching data field
b1-b7 = reserved



Ancillary Data Descriptor 

This DVB descriptor is used to indicate the 
presence and type of ancillary data in MPEG 
audio elementary streams. 

Descriptor_tag 

This 8-bit field has a value of “0110 1011.”

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field, and has a 
value of 0x01. 



Ancillary_data_identifier 

This 8-bit field identifies the ancillary data
coded in the audio elementary stream. If a bit
is set to “1” it indicates that the corresponding
ancillary data is present.

b0 = DVD-Video ancillary data
b1 = extended ancillary data
b2 = announcement switching data 
b3 = DAB ancillary data 
b4 = scale factor error check 
b5 = reserved 
b6 = reserved 
b7 = reserved 



Component Descriptor 

This ARIB and DVB descriptor indicates 
the type of stream and may be used to provide 
a text description of the stream. For DVB, it is 
only present in the EIT and SIT. 

Descriptor_tag 

This 8-bit field has a value of “0101 0000.” 

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. 

Reserved_bits

These 4 bits are always “0000.” 

Stream_content 

This 4-bit codeword indicates the type of 
content in the stream (audio, video, or data). 

Component_type

This 8-bit codeword indicates the type of 
audio, video, or data. 

Component_tag

This 8-bit field has the same value as
component_tag in the Stream Identifier
Descriptor for the component stream.

ISO_639_language_code

This 24-bit field contains a 3-character lan- 
guage code. 

Text_char 

[n] bytes that specify a text description of 
the stream. 

Country Availability Descriptor 

This ARIB and DVB descriptor identifies 
countries that are either allowed or not allowed 
to receive the service. The descriptor may 
appear twice for each service, once for listing 
countries allowed to receive the service, and a 
second time for listing countries not allowed to 
receive the service. The latter list overrides 
the former list. 

Descriptor_tag 

This 8-bit field has a value of “0100 1001.” 

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. 

Country_availability_flag 

A “1” for this 1-bit flag indicates the country
codes specify countries that may receive
the service. A “0” indicates the country codes
specify the countries that may not receive the
service.



Country_code 

[n] 24-bit fields that identify countries, 
using the 3-character code as specified in ISO 
3166. 

Data Broadcast ID Descriptor 

This DVB descriptor identifies the data 
coding system standard. 

Descriptor_tag 

This 8-bit field has a value of “0110 0100.” 

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. 

Data_broadcast_ID 

This 16-bit field identifies the data broad- 
cast specification that is used to broadcast the 
data in the broadcast network. Allocations of 
the value of this field are found in ETR 162. 



Note: The following field may be repeated [n] times. 

ID_selector_byte 

The definition of this data depends on 
data_broadcast_ID. 

DTS Audio Descriptor 

When a DTS® audio stream is included in a
DVB transport stream, this descriptor must
also be included. It has a descriptor_tag value of
“0111 1011.”

Either this descriptor or the MPEG-2 Reg- 
istration Descriptor must also be located in the 
PMT and SIT to identify a DTS® stream. 

Extension Descriptor

This DVB descriptor is used to extend the
8-bit value of descriptor_tag. It has a
descriptor_tag value of “0111 1111.”

Mosaic Descriptor 

A mosaic component is a collection of
different video images that form a coded video
component. The information is organized so
that each specific piece of information, when
displayed, appears in a small area of the screen.

This ARIB and DVB descriptor partitions a
digital video component into elementary cells,
allocates elementary cells to logical cells,
and links the content of each logical cell
to the corresponding information (e.g.,
bouquet, service, event, etc.). It has a
descriptor_tag value of “0101 0001.”

Parental Rating Descriptor 

This ARIB and DVB descriptor gives a rat- 
ing based on age and offers extensions to be 
able to use other rating criteria. For DVB, it is 
only present in the EIT and SIT. 

Descriptor_tag 

This 8-bit field has a value of “0101 0101.” 

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. 



Note: The following two fields are repeated [n] times.

Country_code 

This 24-bit field identifies a country using 
the 3-character code, as specified in ISO 3166. 



Rating 

This 8-bit field indicates the recommended 
minimum age in years of the viewer. 

Private Data Specifier Descriptor 

This DVB descriptor is used to identify the
specifier of any private descriptors or private
fields within descriptors.

Descriptor_tag

This 8-bit field has a value of “0101 1111.”

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field, and has a 
value of 0x04. 

Private_data_specifier 

The assignment of values for this field is 
given in ETR 162. 

Scrambling Descriptor 

This DVB descriptor indicates the selected 
mode of operation for the scrambling system. 

Descriptor_tag

This 8-bit field has a value of “0110 0101.”

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. 

Scrambling_mode 

This 8-bit value identifies the selected 
mode of the DVB Common Scrambling Algo- 
rithm. 

Service Move Descriptor

This DVB descriptor enables a decoder to 
track a service when it is moved from one 
transport stream to another. 

Descriptor_tag 

This 8-bit field has a value of “0110 0000.” 

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. 

New_original_network_ID 

This 16-bit field specifies the
original_network_ID of the transport stream in
which the service is found after the move.

New_transport_stream_ID 

This 16-bit field specifies the 
transport_stream_ID of the transport stream in 
which the service is found after the move. 

New_service_ID 

This 16-bit field specifies the service_ID of 
the service after the move. 
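
Because the payload is simply three consecutive 16-bit fields, it can be unpacked in one step. The Python sketch below illustrates this; the function name and the sample ID values are hypothetical, and the bytes are assumed to be in big-endian (network) order.

```python
import struct

def parse_service_move(payload: bytes) -> dict:
    """Unpack the three 16-bit fields that follow descriptor_length."""
    if len(payload) < 6:
        raise ValueError("service move payload must be at least 6 bytes")
    new_onid, new_tsid, new_sid = struct.unpack(">HHH", payload[:6])
    return {
        "new_original_network_ID": new_onid,
        "new_transport_stream_ID": new_tsid,
        "new_service_ID": new_sid,
    }

# Made-up values, for illustration only.
moved = parse_service_move(bytes.fromhex("20fa10010065"))
```
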

Stream Identifier Descriptor 

This ARIB and DVB descriptor enables 
specific streams to be associated with a 
description in the EIT. This is used where 
there is more than one stream of the same type 
within a service. 

Descriptor_tag 

This 8-bit field has a value of “0101 0010.” 



Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field, and has a 
value of 0x01. 

Component_tag 

This 8-bit field identifies the component 
stream associated with a component descrip- 
tor. Within the PMT, each Stream Identifier 
Descriptor has a different value for this field. 

Subtitling Descriptor 

This DVB descriptor identifies ETSI EN 
300 743 subtitle data. 

Descriptor_tag 

This 8-bit field has a value of “0101 1001.” 

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. 



Note: The following four fields may be repeated [n] times
to allow identifying multiple data services using a single
descriptor.

ISO_639_language_code 

This 24-bit field contains a 3-character lan- 
guage code. 

Subtitling_type

This 8-bit field provides information on the 
content of the subtitle and the intended display. 

Composition_page_ID 

This 16-bit field identifies the composition 
page. 

Ancillary_page_ID

This 16-bit field identifies the (optional) 
ancillary page. 

Teletext Descriptor 

This DVB descriptor is used to identify ele- 
mentary streams which carry EBU Teletext 
data. 

Descriptor_tag 

This 8-bit field has a value of “0101 0110.” 

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. 



Note: The following four fields may be repeated [n] times
to allow identifying multiple data services using a single
descriptor.

ISO_639_language_code 

This 24-bit field contains a 3-character lan- 
guage code. 

Teletext_type

This 5-bit codeword specifies the type of 
teletext page: 

0x01 = initial teletext page 

0x02 = teletext subtitle page 

0x03 = additional information page 

0x04 = program schedule page 

0x05 = teletext subtitle page for hearing impaired 

Teletext_magazine_number 

This 3-bit binary number identifies the 
magazine number. 



Teletext_page_number 

This 8-bit field specifies the teletext page 
number as two 4-bit hex digits. 
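
The four repeated fields occupy five bytes per entry (24 + 5 + 3 + 8 bits). A minimal Python sketch of the loop follows; it assumes the fields are packed most significant bit first in the order listed, and the function name and sample entries are hypothetical.

```python
TELETEXT_TYPE = {
    0x01: "initial teletext page",
    0x02: "teletext subtitle page",
    0x03: "additional information page",
    0x04: "program schedule page",
    0x05: "teletext subtitle page for hearing impaired",
}

def parse_teletext_descriptor(payload: bytes) -> list:
    """Parse the repeated 5-byte entries that follow descriptor_length."""
    if len(payload) % 5:
        raise ValueError("payload is not a whole number of 5-byte entries")
    entries = []
    for pos in range(0, len(payload), 5):
        packed = payload[pos + 3]       # teletext_type (5 bits), magazine (3 bits)
        page = payload[pos + 4]         # two 4-bit hex digits
        entries.append({
            "language": payload[pos:pos + 3].decode("latin-1"),
            "teletext_type": TELETEXT_TYPE.get(packed >> 3, "reserved"),
            "magazine_number": packed & 0x7,
            "page_number": f"{page >> 4:X}{page & 0xF:X}",
        })
    return entries

# Two made-up entries: an English initial page and a German subtitle page.
teletext_pages = parse_teletext_descriptor(
    b"eng" + bytes([0x09, 0x00]) + b"deu" + bytes([0x10, 0x88]))
```
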

VBI Data Descriptor 

This DVB and OpenCable™ descriptor
defines the VBI service type in the associated
packetized elementary stream (PES).

Descriptor_tag 

This 8-bit field has a value of “0100 0101.” 

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. 



Note: The following fields may be repeated [n] times to 
allow identifying multiple data services using a single 
descriptor. 

Data_service_ID

This 8-bit binary number identifies the 
type of VBI data present in the associated ele- 
mentary stream. It has a value of: 

0x01 = EBU teletext

0x02 = EBU teletext with inverted framing code

0x04 = video program system (VPS)

0x05 = widescreen signaling (WSS)

0x06 = closed captioning

0x07 = monochrome 4:2:2 samples

0xF7 = vertical interval timecode (VITC)

0xF9 = copy protection

0xFB = TV Guide

0xFC = NABTS

0xFE = AMOL I / II







Data_service_description_length 

This 8-bit binary number indicates the 
number of bytes following this field. 



Note: The following fields are present, and may be
repeated [n] times, when data_service_ID = 0x01, 0x02,
0x04, 0x05, 0x06, 0x07, 0xF7, 0xFB, 0xFC, or 0xFE.

Reserved_bits 

These 2 bits always have a value of “11.” 

Field_parity 

A “1” for this bit indicates Field 1 data; a 
“0” indicates Field 2 data. 

Line_offset 

This 5-bit binary number specifies the line
number the VBI data is to be inserted on for a
480i or 576i video signal. When field_parity =
“0,” a value of 263 (480i) or 313 (576i), in
decimal, is added to the line_offset value to
obtain the line number.
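
The line number calculation can be written out as a short Python sketch. The function name vbi_line_number and the format strings "480i" and "576i" are hypothetical conveniences; the offsets are the decimal values given above.

```python
FIELD_2_OFFSET = {"480i": 263, "576i": 313}

def vbi_line_number(field_parity: int, line_offset: int,
                    video_format: str = "480i") -> int:
    """Return the video line that carries the VBI data."""
    if not 0 <= line_offset <= 31:
        raise ValueError("line_offset is a 5-bit value")
    if field_parity == 1:
        return line_offset              # Field 1: the offset is the line number
    return line_offset + FIELD_2_OFFSET[video_format]   # Field 2

field_1_line = vbi_line_number(1, 21)           # 21
field_2_line = vbi_line_number(0, 21)           # 21 + 263 = 284
pal_field_2_line = vbi_line_number(0, 10, "576i")   # 10 + 313 = 323
```
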



Note: The following field is present, and may be repeated
[n] times, when data_service_ID ≠ 0x01, 0x02, 0x04,
0x05, 0x06, 0x07, 0xF7, 0xFB, 0xFC, or 0xFE.

Reserved_bits 

These 8 bits always have a value of “1111
1111.”

VBI Teletext Descriptor 

The syntax for this descriptor is the same
as for Teletext Descriptor. The only difference is
that it is not used to associate stream_type 0x06
with either the VBI or EBU teletext standard.
Decoders use the languages in this descriptor
to select magazines and subtitles. It has a
descriptor_tag value of “0100 0110.”



OpenCable PMT Descriptors 

These additional descriptors are used to
carry OpenCable™-related information. They
are contained within a descriptor_loop in the
MPEG-2 PMT and may also be present in
other OpenCable™-specific tables. OpenCable™
descriptors not associated with the PMT
are discussed in Chapter 16.

AC-3 Audio Stream Descriptor 

This OpenCable™ descriptor is the same as 
the one discussed in the ATSC Descriptors 
section. 

Component Name Descriptor (ATSC) 

This OpenCable™ descriptor is the same as 
the one discussed in the ATSC Descriptors 
section. 

Component Name Descriptor (SCTE) 

This OpenCable™ descriptor defines an 
optional textual name tag for any component of 
the service. 

Descriptor_tag 

This 8-bit field has a value of “1000 0100.” 

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. 

Reserved_bits 

These 2 bits always have a value of “11.” 

String_count 

This 6-bit binary number, with a range of 
1-63, specifies the number of name strings 
being defined in this descriptor. 

Note: [string_count] specifies how many times the
following three fields are repeated.

ISO_639_language

This 24-bit field contains a 3-character lan- 
guage code. 

String_length

This 8-bit binary number, with a range of 
1-31, specifies the length in bytes of the multi- 
lingual name string that follows. 

Name_string() 

Variable-length text string. 

Content Advisory Descriptor 

This OpenCable™ descriptor is the same as 
the one discussed in the ATSC Descriptors 
section. 

Extended Video Descriptor 

This OpenCable™ descriptor identifies 
some attributes that may be needed for pro- 
cessing. 

Descriptor_tag 

This 8-bit field has a value of “1000 0011.” 

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. 

Catalog_mode_flag 

A “1” for this 1-bit flag indicates that the
video stream supports applications that select
and display (video hold) single frames from
the processed bitstream. A “0” indicates the
bitstreams are decoded and displayed
normally.



Video_includes_setup 

A “1” for this 1-bit flag indicates that the 
video in the bitstream includes a 7.5 IRE blank- 
ing pedestal. A “0” indicates that the video in 
the bitstream does not include a 7.5 IRE blank- 
ing pedestal. 

Reserved_bits 

These 6 bits always have a value of “11
1111.”

Frame Rate Descriptor 

This OpenCable™ descriptor identifies the 
video frame rate. 

Descriptor_tag 

This 8-bit field has a value of “1000 0010.” 

Descriptor_length

This 8-bit binary number specifies the 
number of bytes following this field. 

Multiple_frame_rate_flag

A “1” for this 1-bit flag indicates that multi- 
ple frame rates may be present in the video 
stream. A “0” indicates that only a single frame 
rate is present. 

Frame_rate_code 

This 4-bit codeword specifies the frame rate
present in the video stream. If
multiple_frame_rate_flag is a “1,” additional
frame rates may also be present.

Reserved_bits 

These 3 bits always have a value of “111.” 
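
The three fields fill exactly one byte (1 + 4 + 3 bits). The Python sketch below splits that byte, assuming most-significant-bit-first packing in the order listed; the function name and sample bytes are hypothetical, and frame_rate_code is returned as a raw number because its codeword table is not reproduced here.

```python
def parse_frame_rate_descriptor(payload: bytes) -> dict:
    """Decode the single payload byte of the descriptor (assumed MSB-first)."""
    if len(payload) < 1:
        raise ValueError("payload too short")
    b = payload[0]
    return {
        "multiple_frame_rate_flag": (b >> 7) & 0x1,
        "frame_rate_code": (b >> 3) & 0x0F,
        "reserved_bits": b & 0x07,      # expected to be binary 111
    }

single_rate = parse_frame_rate_descriptor(bytes([0x27]))    # 0 0100 111
multiple_rates = parse_frame_rate_descriptor(bytes([0xA7])) # 1 0100 111
```
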

MAC Address List Descriptor

This OpenCable™ descriptor is used when
implementing IP (Internet Protocol)
multicasting over MPEG-2 transport streams.
Streams carrying IP data are identified as
containing DSM-CC sections by assigning
stream_type = 0x0D within the PMT.

This descriptor is used to identify the data,
by multicast MAC group addresses, being
carried by each data elementary stream. It has a
descriptor_tag value of “1010 1100.”

Redistribution Control Descriptor 

This OpenCable™ descriptor is the same as 
the one discussed in the ATSC Descriptors 
section. 

VBI Data Descriptor 

This OpenCable™ descriptor is the same as 
the one discussed in the DVB Descriptors sec- 
tion. 



Closed Captioning 

CEA-608 and CEA-708 are the primary 
closed captioning standards. While CEA-608 
(discussed in Chapter 8) was originally 
designed for use with NTSC broadcasts, CEA- 
708 is designed for use with digital TV broad- 
casts. 

There are currently no standards for con- 
veying closed captioning data to TVs using the 
HDMI or 480p/720p/1080i/1080p analog 
YPbPr interfaces. In these cases, closed cap- 
tion decoding and display must be done in the 
box where the compressed video decoding is 
performed. 



CEA-708 

The CEA-708 DTV closed captioning stan- 
dard makes a number of changes to the NTSC- 
based CEA-608 closed captioning standard. 
The focus is on giving viewers better looking 
information, and giving them more control 
over it. 

Most important, more information can 
now be included. CEA-608 supports up to 960 
bps for captioning information, while CEA-708 
reserves a constant 9600 bps for captioning 
(including the 960 bps used for CEA-608 cap- 
tioning). 

Viewers can control the size of the captioning
text. Those with poor vision can make it
larger, those who prefer that captions not cover
so much of the picture can make it smaller, and
everyone else can leave it as it is.

CEA-708 also offers more letters and sym- 
bols, supporting multilingual captioning. While 
the CEA-608 character set doesn’t have all of 
the letters and accent marks needed for proper 
captioning in languages like French, Spanish, 
German, Italian, or Portuguese, CEA-708 pro- 
vides all of these and more. 

Support for multiple fonts and more colors 
eliminates the familiar clunky monospaced 
white-on-black look. Eight fonts (including pro- 
portional spaced, casual, and script fonts) and 
up to 64 text and background colors are speci- 
fied, although caption decoders aren’t required 
to support all the fonts and colors. This allows 
captioners to improve the look of the captions; 
however, they will have to take into consider- 
ation how the captions will appear on televi- 
sions without the multiple font support. 

The additional color support means the
traditional black box background can be
replaced by a colored box, or done away with
entirely in favor of edged or drop-shadowed
text. The caption box can also be made
translucent (see-through).

Figure 13.25. DTV Closed Captioning Data in MPEG-2 Bitstream.

Data Type                 Code   ATSC   DVB   OpenCable   SCTE 21
closed captions           0x03   yes    yes   yes         yes
additional_CEA_608_data   0x04   -      -     yes         yes
luma_PAM_data             0x05   -      -     yes         yes
bar information           0x06   yes    yes   yes         yes

Table 13.54. user_data_type_code Codewords.

CEA-708 allows adding closed captioning
data to an MPEG-2 transport stream using
user_data at the sequence, GOP, or picture
layer of the video bitstream. Figure 13.25
illustrates the DTV closed captioning protocol
model.

CEA-608 captioning data is not embedded 
within the DTV protocol stack. This allows it to 
be extracted at the transport layer, enabling 
simpler captioning decoder designs since the 
entire DTV closed caption channel bitstream 
does not have to be parsed to recover the two 
bytes of CEA-608 data. 

MPEG coded pictures are transmitted in a
different order than they are displayed. The
captioning data travels with the pictures, so it
must be reordered by the decoder, along with
the pictures to which it corresponds, prior to
packet location and extraction.

MPEG-2 Video 

CEA-708 closed captioning uses a continuous
9600 bps stream allocated from the signal
capacity. The captioning data is allocated on a
frame-by-frame basis so that 1200 bytes of data
are transported per second. This enables up to
40 bytes of caption data (20 two-byte caption
constructs) per frame for a 480i30 or 1080i30
signal. On average, CEA-608 captions are
allocated 960 bps, and CEA-708 captions are
allocated 8640 bps.
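
The bandwidth figures follow directly from the 9600 bps total, as the short Python calculation below shows. The function name caption_budget is hypothetical, nominal picture rates of 30 and 60 per second are assumed, and each caption construct is taken to carry two data bytes (cc_data_1 and cc_data_2).

```python
from fractions import Fraction

TOTAL_CAPTION_BPS = 9600    # constant rate reserved for captioning
CEA608_BPS = 960            # portion used for CEA-608 captioning

def caption_budget(pictures_per_second: int) -> dict:
    """Derive per-second and per-picture caption capacity from the 9600 bps total."""
    bytes_per_second = TOTAL_CAPTION_BPS // 8
    bytes_per_picture = Fraction(bytes_per_second) / Fraction(pictures_per_second)
    return {
        "bytes_per_second": bytes_per_second,
        "bytes_per_picture": bytes_per_picture,
        "constructs_per_picture": bytes_per_picture / 2,  # two data bytes each
        "cea708_bps": TOTAL_CAPTION_BPS - CEA608_BPS,
    }

budget_30 = caption_budget(30)
budget_60 = caption_budget(60)
```
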

Closed captioning may only be present at 
the picture layer. The bitstream syntax is: 

User_data_start_code

This 32-bit string has a value of 
0x000001B2. 

User_identifier 

A 32-bit value of 0x47413934 indicates that 
the user data conforms to the CEA-708 closed 
captioning specification. 



User_data_type_code 

The value of this 8-bit codeword specifies 
the type of information that follows, as indi- 
cated in Table 13.54. Other values are either in 
use in other standards or are reserved for 
future use. 



Note: The following eleven fields are present when 
user_data_type_code = 0x03. 

Reserved_bit 

This bit is always “1.” 

Process_cc_data_flag 

If this bit is a “1,” the cc_data must be
processed. If it is a “0,” the cc_data can be
discarded.

Zero_bit 

This bit is always a “0.” 

Cc_count 

This 5-bit binary number, with a range of 
0-31, indicates the number of closed caption 
constructs following this field. The value is set 
such that a fixed bandwidth of 9600 bps is 
maintained for the closed caption data. 

Reserved_bits 

These 8 bits are always “1111 1111.”



Note: [cc_count] specifies how many times the following 
five fields are repeated. 

Marker_bits 

These 5 bits are always “1 1111.” 
Cc_valid

If this bit is a “1,” the two closed caption 
data bytes are valid. If it is a “0,” the two data 
bytes are invalid. If invalid, CEA-608 clock run- 
in and start bits are not generated. 

Cc_type 

These 2 bits specify the type of closed cap- 
tion data that follows, as shown in Table 13.55. 



CC Type                            Code

CEA-608 line 21 field 1 captions   00
CEA-608 line 21 field 2 captions   01
CEA-708 channel packet data        10
CEA-708 channel packet start       11

Table 13.55. cc_type Codewords.



Cc_data_1

The first 8 bits of closed caption data. They
are processed only if process_cc_data_flag is a
“1.”

Cc_data_2

The second 8 bits of closed caption data.
They are processed only if process_cc_data_flag
is a “1.”



Marker_bits 

These 8 bits are always “1111 1111.”
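
The field list above for user_data_type_code = 0x03 is byte aligned, so it can be walked with plain byte indexing. The following is a minimal sketch under that reading of the syntax, not a complete decoder: the function name and the returned dictionary keys are hypothetical, and the input is assumed to be one complete user_data structure beginning with its start code.

```python
import struct

USER_DATA_START_CODE = 0x000001B2
ATSC_USER_IDENTIFIER = 0x47413934   # ASCII "GA94"
CC_TYPE_NAMES = {
    0b00: "CEA-608 line 21 field 1 captions",
    0b01: "CEA-608 line 21 field 2 captions",
    0b10: "CEA-708 channel packet data",
    0b11: "CEA-708 channel packet start",
}


def parse_cc_user_data(buf):
    """Return (process_cc_data_flag, constructs), or None if buf is not
    caption user data (wrong start code, identifier, or type code)."""
    if len(buf) < 11:
        raise ValueError("user_data too short")
    start_code, identifier, type_code = struct.unpack_from(">IIB", buf, 0)
    if start_code != USER_DATA_START_CODE or identifier != ATSC_USER_IDENTIFIER:
        return None
    if type_code != 0x03:
        return None
    flags = buf[9]                    # reserved, process flag, zero, cc_count
    process_cc_data_flag = bool(flags & 0x40)
    cc_count = flags & 0x1F
    pos = 11                          # buf[10] is the reserved 0xFF byte
    if len(buf) < pos + 3 * cc_count:
        raise ValueError("truncated cc_data")
    constructs = []
    for _ in range(cc_count):
        header = buf[pos]             # 5 marker bits, cc_valid, cc_type
        constructs.append({
            "cc_valid": bool(header & 0x04),
            "cc_type": header & 0x03,
            "cc_data": (buf[pos + 1], buf[pos + 2]),
        })
        pos += 3
    return process_cc_data_flag, constructs
```

A caller would typically discard every construct when process_cc_data_flag is false, and skip individual constructs whose cc_valid is false.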



Note: The following seven fields are present when 
user_data_type_code = 0x04. 

Marker_bits 

These 3 bits are always “111.” 



Additional_cc_count 

This 5-bit binary number specifies the number
of lines on which CEA-608 data is present.



Note: [additional_cc_count] specifies how many times 
the following five fields are repeated. 

Additional_cc_valid 

If this bit is a “1,” the two closed caption 
data bytes are valid. If it is a “0,” the two data 
bytes are invalid. 

Additional_cc_line_offset 

This 5-bit binary number specifies the off- 
set in lines from which the CEA-608 closed 
caption data originated relative to lines 9 and 
272 for 480i systems or lines 5 and 318 for 576i 
systems. 

Additional_cc_field_number 

This 2-bit codeword indicates the number 
of the field, in display order, from which the 
CEA-608 data originated. 

00 = forbidden 

01 = 1st field 

10 = 2nd field 

11 = 3rd display field (repeated field in film mode) 

Additional_cc_data_1

The first 8 bits of CEA-608 closed caption
data. They are processed only if
additional_cc_valid is a “1.”

Additional_cc_data_2

The second 8 bits of CEA-608 closed caption
data. They are processed only if
additional_cc_valid is a “1.”
Note: The following field is present when a
user_data_type_code is used that is not listed in Table
13.54.

Reserved_user_data 

These optional 8 bits are reserved. 

MPEG-2 Video (SCTE 21) 

The SCTE 21 specification, used by
OpenCable, forms the basis for digital cable
closed captioning.

SCTE 21 extends ATSC closed captioning 
to better support CEA-608 captions. This is 
because some cable systems use the CEA-608 
closed caption format to carry non-captioning 
data on other VBI lines. 

A pulse-amplitude modulation (PAM)
scheme is also available to transfer basic VBI
waveforms, such as:

• CEA-608-compliant closed captioning for
one or more VBI lines other than line 21

• Nielsen Source Identification (SID) /
Automated Measurement of Lineups (AMOL)
signals

• North American Basic Teletext per the
EIA-516 NABTS Specification

• World System Teletext (WST)

• Vertical Interval Timecode (VITC)

Although most standards use two-level 
luminance encoding, multi-level PAM coding is 
also supported. 



MPEG-2 Video (SCTE 20) 

The SCTE 20 specification is an early stan- 
dard for digital cable closed captioning, which 
may be present at the picture layer as 
user_data. Both SCTE 20 and 21 must be sup- 
ported by OpenCable and other digital cable 
decoders to ensure backwards compatibility 
with current systems. 

User_data_start_code 

This 32-bit string has a value of
0x000001B2.

User_data_type_code 

The value of this 8-bit codeword is 0x03, 
indicating closed captioning information. 

Reserved_bits 

These 7 bits are always “100 0000.” How- 
ever, some early cable systems instead use 
“000 0000,” so the value of this field should be 
ignored by decoders. 

VBI_data_flag 

If this bit is a “1,” one or more VBI data 
constructs follow. 

Cc_count 

This 5-bit binary number, with a range of 
0-31, indicates the number of closed caption 
constructs following this field. 
Note: [cc_count] specifies how many times the following
six fields are repeated.

Cc_priority 

This 2-bit codeword indicates the priority 
of constructs in picture reconstruction. “00” is 
highest priority and “11” is lowest priority. 

Field_number 

This 2-bit codeword indicates the number 
of the field, in display order, from which the 
CEA-608 data originated. 

00 = forbidden 

01 = 1st field 

10 = 2nd field 

11 = 3rd display field (repeated field in film mode) 

Line_offset 

This 5-bit binary number specifies the off- 
set in lines from which the CEA-608 closed 
caption data originated relative to lines 10 and 
273 for 480i systems or lines 6 and 319 for 576i 
systems. 

Cc_data_1

The first 8 bits of closed caption data. 

Cc_data_2 

The second 8 bits of closed caption data. 

Marker_bit 

This bit is always “1.” 
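
Unlike the ATSC syntax, each SCTE 20 caption construct is 26 bits long, so the fields are not byte aligned and a bit-level reader is needed. The sketch below follows the field widths listed above. The BitReader class and function names are hypothetical, the payload is assumed to begin immediately after the 32-bit user_data_start_code, and reading cc_count only when VBI_data_flag is set is an assumption based on the flag description. The caption bytes are returned exactly as they appear in the bitstream, with no further interpretation.

```python
class BitReader:
    """Reads big-endian (MSB first) bit fields from a bytes object."""

    def __init__(self, data):
        self.data = data
        self.pos = 0                      # position in bits

    def read(self, nbits):
        value = 0
        for _ in range(nbits):
            byte = self.data[self.pos >> 3]
            value = (value << 1) | ((byte >> (7 - (self.pos & 7))) & 1)
            self.pos += 1
        return value


def parse_scte20_captions(payload):
    """Parse the fields that follow the 32-bit user_data_start_code."""
    reader = BitReader(payload)
    if reader.read(8) != 0x03:            # user_data_type_code
        raise ValueError("not closed captioning user data")
    reader.read(7)                        # reserved bits, value ignored
    constructs = []
    if reader.read(1):                    # VBI_data_flag
        cc_count = reader.read(5)
        for _ in range(cc_count):
            construct = {
                "priority": reader.read(2),
                "field_number": reader.read(2),
                "line_offset": reader.read(5),
                "cc_data": (reader.read(8), reader.read(8)),
            }
            reader.read(1)                # marker_bit
            constructs.append(construct)
    return constructs


def line_number_480i(field_number, line_offset):
    """Line for 480i systems: offsets are relative to lines 10 and 273."""
    base = {1: 10, 2: 273}.get(field_number)
    return None if base is None else base + line_offset
```

For example, a construct with field_number 1 and line_offset 11 maps to line 21, the usual CEA-608 caption line.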



Non_real_time_video_count

This 4-bit binary number has a value of
0-15, and indicates the number of non-real-time
video constructs that follow.



Note: [non_real_time_video_count] specifies how many 
times the following eight fields are repeated. 

Non_real_time_video_priority 

This 2-bit codeword indicates the priority 
of constructs in non-real-time VBI data. “00” is 
highest priority and “11” is lowest priority. 

Sequence_number 

This 2-bit binary number increments by
one between sequences. A value of “00”
indicates the non-real-time-sampled video line is
not to be reconstructed (is inactive) until a
segment is received with a non-zero
sequence_number.

Non_real_time_video_field_number 

This 1-bit flag indicates whether to
reconstruct the data into the odd field (“0”) or
even field (“1”).

Line_offset

This 5-bit binary number specifies the off- 
set in lines from which the VBI data originated 
relative to lines 10 and 273 for 480i systems or 
lines 6 and 319 for 576i systems. 



Note: The following four fields are present when
sequence_number ≠ 00.

Segment_number 

This 5-bit binary number specifies the 
number of the non-real-time sampled video 
segment, starting with “0 0001.” 

Non-real-time sampled video is divided into
64-byte segments, each transported as an
array of 32 luminance (Y) samples followed by
an array of 16 chrominance sample pairs (Cb,
Cr), starting with the most significant bit of the
left-most sample. All segments of the sequence
shall be transmitted in order before any
segment of a new sample of the same
non-real-time video line.



Note: The following field is repeated thirty-two times. 

Non_real_time_video_Y_data

Eight bits of non-real-time Y data for this 
segment. 



Note: The following two fields are repeated sixteen times. 

Non_real_time_video_Cb_data

Eight bits of non-real-time Cb data for this
segment.

Non_real_time_video_Cr_data

Eight bits of non-real-time Cr data for this 
segment. 

MPEG-4.10 (H.264) Video 

Closed captioning is carried in the SEI 
RBSP syntax of the video elementary stream. 

The user_data_start_code field of the MPEG-2
captioning syntax is replaced with the
following two fields:

Itu_t_t35_country_code

This 8-bit field has a value of 0xB5.

Itu_t_t35_provider_code

This 16-bit field has a value of 0x0031.

SMPTE 421M (VC-1) Video 

Closed captioning is optionally carried in 
the user data of the video elementary stream. 



The user_data_start_code field of the MPEG-2
captioning syntax is replaced with the
following field:

VC1_user_data_start_code

This 32-bit string of 0x0000011D indicates
the beginning of user_data.

VBI Standard 

The “VBI standard,” discussed next, also 
defines how to add closed captioning data to an 
MPEG-2 transport stream. 



VBI Standard 

The ETSI EN 301 775 and OpenCable™ 
standards define how to add closed captioning, 
teletext, video program system (VPS), wide- 
screen signaling (WSS) data, etc. to an MPEG- 
2 transport stream for DVB and digital cable 
applications. The data is carried in MPEG-2 
PES packets as private stream 1 which are in 
turn carried by transport packets. Although 
originally designed for the DVB standard, it is 
applicable to any MPEG-2 bitstream. Use of 
the DVB VBI Teletext Descriptor is required. 

The syntax for the PES data field is: 

Data_identifier 

This 8-bit binary number identifies the
type of data carried in the PES packet. It has a
value of 0x10 to 0x1F or 0x99 to 0x9B. For
OpenCable, a value of 0x99 is used.



Note: The following fields may be repeated [n] times to 
allow transmission of multiple types of data within a sin- 
gle stream. 
Data_unit_ID

This 8-bit binary number identifies the
type of data present. It has a value of:

0x02 = EBU teletext non-subtitle data

0x03 = EBU teletext subtitle data

0xC0 = EBU teletext with inverted framing code

0xC3 = video program system (VPS)

0xC4 = widescreen signaling (WSS)

0xC5 = closed captioning

0xC6 = monochrome 4:2:2 samples

0xD0 = AMOL I

0xD1 = AMOL II

0xD5 = U.S. teletext (NABTS)

0xD6 = TV Guide

0xD7 = copy protection

0xD9 = vertical interval timecode (VITC)

0xFF = stuffing



Data_unit_length 

This 8-bit binary number indicates the
number of bytes following this field. If
data_identifier has a value between 0x10 and
0x1F inclusive, this field must be set to 0x2C.



Note: The following fields are present when
data_unit_ID = 0x02, 0x03, or 0xC0. This packet is used
to convey EBU teletext information (ETSI EN 300 706).

Reserved_bits 

These 2 bits always have a value of “11.” 

Field_parity 

A “1” for this bit indicates Field 1 data; a 
“0” indicates Field 2 data. 



Line_offset 

This 5-bit binary number specifies the 576i
line number the teletext data is to be inserted
on. When field_parity = “0,” a value of 313
(decimal) is added to the line_offset value.

Framing_code 

This 8-bit field specifies the framing code 
to be used. For EBU teletext, it has a value of 
“1110 0100.” For EBU inverted teletext, it has a 
value of “0001 1011.” 

Inverted teletext is used to carry signals 
that are not intended for public reception, such 
as downstream controls for cable head-end 
equipment, schedules, etc. The use of inverted 
teletext is on the decline with many broadcast- 
ers now instead using teletext packet 31. 

Txt_data_block 

This 336-bit field corresponds to the 42 
bytes of 576i EBU teletext data that follows the 
clock run-in (clock sync) and framing code 
(byte sync). 
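
The teletext data unit described above is byte aligned: one byte of flags, one framing code byte, and 42 bytes of teletext data, which together account for a data_unit_length of 0x2C (44). The sketch below unpacks one such unit and derives the 576i line number from field_parity and line_offset. The function name and return format are hypothetical, and requiring a length of exactly 0x2C is an assumption that matches this layout.

```python
TELETEXT_UNIT_IDS = (0x02, 0x03, 0xC0)
FIELD_2_LINE_BASE = 313      # added to line_offset when field_parity is 0


def parse_teletext_data_unit(unit):
    """Unpack one EBU teletext data unit, starting at data_unit_ID."""
    if len(unit) < 46:
        raise ValueError("data unit too short")
    data_unit_id, data_unit_length = unit[0], unit[1]
    if data_unit_id not in TELETEXT_UNIT_IDS:
        raise ValueError("not an EBU teletext data unit")
    if data_unit_length != 0x2C:
        raise ValueError("unexpected data_unit_length")
    field_parity = (unit[2] >> 5) & 1    # after the 2 reserved bits
    line_offset = unit[2] & 0x1F
    if field_parity:
        line_number = line_offset                       # Field 1
    else:
        line_number = line_offset + FIELD_2_LINE_BASE   # Field 2
    return {
        "data_unit_id": data_unit_id,
        "field_parity": field_parity,
        "line_number": line_number,
        "framing_code": unit[3],
        "txt_data_block": bytes(unit[4:46]),
    }
```

With line_offset 7, a unit flagged as Field 1 maps to line 7, while the same offset flagged as Field 2 maps to line 320.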



Note: The following fields are present when 

data_unit_ID = 0xC3. This packet is used to convey VPS 
information (ETSI EN 300 231). 

Reserved_bits 

These 2 bits always have a value of “11.” 

Field_parity 

Decoders need only implement field_parity
when it is a “1.” They may ignore packets when
this bit is a “0.”

Line_offset

This 5-bit binary number specifies the line
number the VPS data is to be inserted on for a
576i video signal. Decoders need only
implement line_offset when it is “1 0000”; they
may ignore other lines.

VPS_data_block 

This 104-bit field corresponds to the 13 
bytes of 576i VPS data that follows the clock 
run-in and start code data. 



Note: The following fields are present when
data_unit_ID = 0xC4. This packet is used to convey WSS
information (ETSI EN 300 294).

Reserved_bits 

These 2 bits always have a value of “11.” 

Field_parity 

Decoders need only implement field_parity
when it is a “1.” They may ignore packets when
this bit is a “0.”

Line_offset

This 5-bit binary number specifies the line
number the WSS data is to be inserted on for a
576i video signal. Decoders need only
implement line_offset when it is “1 0111”; they
may ignore other lines.

WSS_data_block 

This 14-bit field corresponds to the 14 bits 
of 576i WSS data that follows the run-in and 
start code data. 



Reserved_bits 

These 2 bits always have a value of “11.” 



Note: The following fields are present when 

data_unit_ID = 0xC5. This packet is used to convey 
CEA-608 closed captioning information. 

Reserved_bits 

These 2 bits always have a value of “11.” 

Field_parity 

A “1” for this bit indicates Field 1 (line 21) 
data; a “0” indicates Field 2 (line 284) data. 

Line_offset 

This 5-bit binary number specifies the line
number the caption data is to be inserted on
for a 480i video signal. Decoders need only
implement line_offset when it is “1 0101”; they
may ignore other lines.

Closed_captioning_data_block

This 16-bit field corresponds to the 16 bits
of 480i CEA-608 closed captioning data.



Note: The following fields are present when
data_unit_ID = 0xD0. This packet is used to convey
AMOL I information.

Reserved_bits 

These 2 bits always have a value of “11.” 

Field_parity 

A “1” for this bit indicates Field 1 data; a 
“0” indicates Field 2 data. 
Line_offset

This 5-bit binary number specifies the line
number the AMOL data is to be inserted on for
a 480i video signal. Valid values are “0 1010”
through “1 0110” (10-22 decimal).

AMOL48_data_block

This 41-bit field corresponds to the 41 bits
of 480i AMOL I data that follow the 7-bit
AMOL I start_of_message header.

Trailer

These 7 bits always have a value of “000
0000.”



Note: The following fields are present when
data_unit_ID = 0xD1. This packet is used to convey
AMOL II information.

Reserved_bits 

These 2 bits always have a value of “11.” 

Field_parity 

A “1” for this bit indicates Field 1 data; a 
“0” indicates Field 2 data. 

Line_offset 

This 5-bit binary number specifies the line
number the AMOL data is to be inserted on for
a 480i video signal. Valid values are “0 1010”
through “1 0110” (10-22 decimal).

AMOL96_data_block

This 88-bit field corresponds to the 88 bits
of 480i AMOL II data that follow the 8-bit
AMOL II start_of_message header.



Note: The following fields are present when 

data_unit_ID = 0xD5. This packet is used to convey U.S. 
Teletext (NABTS) information. 

Reserved_bits 

These 2 bits always have a value of “11.” 

Field_parity 

A “1” for this bit indicates field 1 data; a “0” 
indicates field 2 data. 

Line_offset 

This 5-bit binary number specifies the line
number the NABTS data is to be inserted on
for a 480i video signal. Valid values are “0 1010”
through “1 0110” (10-22 decimal).

Framing_code

This 8-bit binary number has a value of
“1110 0111.”

NABTS_data_block

This 264-bit field corresponds to the 33
bytes that follow the clock run-in (clock sync)
and framing code (byte sync).

Note: The following fields are present when
data_unit_ID = 0xD6. This packet is used to convey TV
Guide information.

Reserved_bits 

These 2 bits always have a value of “11.” 

Field_parity 

A “1” for this bit indicates Field 1 data; a 
“0” indicates Field 2 data. 

Line_offset 

This 5-bit binary number specifies the line
number the TV Guide data is to be inserted on
for a 480i video signal. Valid values are “0 1010”
through “1 0110” (10-22 decimal).

TVG2X_data_block 

This 32-bit field corresponds to the 32 bits 
of 480i TV Guide data that follows the clock 
run-in and framing code. 



Note: The following fields are present when 

data_unit_ID = 0xD7. This packet is used to convey copy 
protection information. 

Reserved_bits 

These 2 bits always have a value of “11.” 

Field_parity 

A “1” for this bit indicates Field 1 data; a 
“0” indicates Field 2 data. 



Line_offset

This 5-bit binary number specifies the line
number the copy protection data is to be
inserted on for a 480i video signal. Valid values
are “0 1100” through “1 0110” (14-22 decimal).

CP_data_block

This 2-bit field corresponds to bits 7 and 8
of IEC 61880, section B.2.

Reserved_bits

These 6 bits always have a value of “11
1111.”



Note: The following fields are present when
data_unit_ID = 0xD9. This packet is used to convey
VITC information.

Reserved_bits 

These 2 bits always have a value of “11.” 

Field_parity 

A “1” for this bit indicates field 1 data; a “0” 
indicates field 2 data. 

Line_offset 

This 5-bit binary number specifies the line
number the VITC data is to be inserted on for a
480i video signal. Valid values are “0 1100”
through “1 0110” (14-22 decimal).

VITC_data_block 

This 64-bit field corresponds to the 64 
active data bits (excluding sync bits) of VITC 
data. 
Note: The following fields are present when
data_unit_ID = 0xFF.

Stuffing_byte

These [n] bytes have a value of “1111
1111.” Any number of stuffing bytes may be
present and they are ignored by the decoder.
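
Because every data unit begins with a data_unit_ID and a data_unit_length, a decoder can step through the PES data field without understanding every payload type. The sketch below shows that traversal. The function name is hypothetical, the input is assumed to be the PES data field beginning with data_identifier, and stuffing units are assumed to carry a length byte like any other unit.

```python
DATA_UNIT_NAMES = {
    0x02: "EBU teletext non-subtitle data",
    0x03: "EBU teletext subtitle data",
    0xC0: "EBU teletext with inverted framing code",
    0xC3: "video program system (VPS)",
    0xC4: "widescreen signaling (WSS)",
    0xC5: "closed captioning",
    0xC6: "monochrome 4:2:2 samples",
    0xD0: "AMOL I",
    0xD1: "AMOL II",
    0xD5: "U.S. teletext (NABTS)",
    0xD6: "TV Guide",
    0xD7: "copy protection",
    0xD9: "vertical interval timecode (VITC)",
}
STUFFING_ID = 0xFF


def iter_data_units(pes_data):
    """Yield (data_unit_id, payload) for each non-stuffing data unit."""
    if not pes_data:
        raise ValueError("empty PES data field")
    pos = 1                                # pes_data[0] is data_identifier
    while pos + 2 <= len(pes_data):
        data_unit_id = pes_data[pos]
        data_unit_length = pes_data[pos + 1]
        end = pos + 2 + data_unit_length
        if end > len(pes_data):
            raise ValueError("truncated data unit")
        if data_unit_id != STUFFING_ID:    # stuffing is ignored
            yield data_unit_id, bytes(pes_data[pos + 2:end])
        pos = end
```

Each yielded payload can then be handed to a type-specific parser selected by data_unit_id.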



Teletext 

Two standards for teletext transmission 
over MPEG-2 are available, the DVB “VBI stan- 
dard” (reviewed in the previous section) and 
the newer “EBU Teletext” standard. The “EBU 
Teletext” standard only supports the transmis- 
sion of EBU teletext data. The DVB “VBI stan- 
dard” allows transmission of multiple types of 
VBI data. 

Some systems may support only the “EBU
Teletext” standard, others just the DVB “VBI
standard,” and others both standards. In the
case where both standards must be supported,
separate PIDs are used; teletext data is
broadcast on both PIDs.

DVB EBU Teletext Standard 

The ETSI EN 300 472 standard defines 
how to add EBU teletext to an MPEG-2 trans- 
port stream for DVB applications. The data is 
carried in MPEG-2 PES packets as private 
stream 1 which are in turn carried by transport 
packets. Although designed for the DVB stan- 
dard, it is applicable to any MPEG-2 bitstream. 
Use of the DVB Teletext Descriptor is required. 

The syntax for the PES data field is: 



Data_identifier 

This 8-bit binary number identifies the
type of data carried in the PES packet. It has a
value of 0x10 to 0x1F.



Note: The following fields may be repeated [n] times. 

Data_unit_ID 

This 8-bit binary number identifies the 
type of data present. It has a value of: 

0x02 = EBU teletext non-subtitle data 
0x03 = EBU teletext subtitle data 

Data_unit_length 

This 8-bit binary number indicates the 
number of bytes following this field, and has a 
value of 0x2C. 

Reserved_bits 

These 2 bits always have a value of “11.” 

Field_parity

A “1” for this bit indicates Field 1 data; a
“0” indicates Field 2 data.

Line_offset

This 5-bit binary number specifies the line
number the teletext data is to be inserted on
for a 576i video signal. Only values of “0 0111”
to “1 0110” are valid. When field_parity = “0,” a
value of 313 (decimal) is added to the
line_offset value to obtain the line number.

Framing_code

This 8-bit field specifies the framing code
to be used. It has a value of “1110 0100.”

Magazine_and_packet_address

This 16-bit field corresponds to the maga- 
zine and packet address. 

Data_block 

This 320-bit field corresponds to the 
remaining 40 bytes of teletext data. 

Active Format Description 
(AFD) 

AFD (a part of ATSC A/53, ETSI TS 101 
154, and CEA-805) should be included in video 
user data whenever the rectangular picture 
area containing useful information does not 
extend to the full height or width of the coded 
frame. The functionality of AFD is similar to 
that of widescreen signaling (WSS), described 
in Chapter 8. 

MPEG-2 Video 

AFD is optionally carried in the user data 
of video elementary bitstreams, after the 
sequence extension, GOP header, and/or pic- 
ture coding extension. 

User_data_start_code

This 32-bit string of 0x000001B2 indicates 
the beginning of user_data. 

User_identifier 

A 32-bit value of 0x44544731 indicates that 
the syntax is for AFD. 

Zero_bit 

Always a “0.” 



Active_format_flag 

If this bit is set to a “1,” an active format is
described in this data structure.

Reserved_bits

Always “00 0001.”

Reserved_bits

These optional bits are always “1111.”
They are present only if active_format_flag = “1.”

Active_format

These optional bits specify the area of
interest as shown in Table 13.56. They are
present only if active_format_flag = “1.”
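
The AFD structure above occupies at most ten bytes: the start code, the identifier, one flag byte, and an optional byte carrying active_format in its low four bits. The sketch below extracts it under that reading of the syntax. The function name and return format are hypothetical, and the lookup dictionary reproduces only a subset of Table 13.56 as (4:3 coded frame, 16:9 coded frame) pairs.

```python
import struct

USER_DATA_START_CODE = 0x000001B2
AFD_USER_IDENTIFIER = 0x44544731    # ASCII "DTG1"

# Subset of Table 13.56: active_format -> (4:3 frames, 16:9 frames)
AFD_MEANING = {
    0b1000: ("4:3 full frame image", "16:9 full frame image"),
    0b1001: ("4:3 full frame image", "4:3 pillarbox image"),
    0b1010: ("16:9 letterbox image", "16:9 full frame image"),
    0b1011: ("14:9 letterbox image", "14:9 pillarbox image"),
}


def parse_afd(buf):
    """Return a dict describing the AFD, or None if buf is not AFD data."""
    if len(buf) < 9:
        raise ValueError("user_data too short")
    start_code, identifier = struct.unpack_from(">II", buf, 0)
    if start_code != USER_DATA_START_CODE or identifier != AFD_USER_IDENTIFIER:
        return None
    active_format_flag = bool(buf[8] & 0x40)   # zero_bit, flag, 6 reserved
    active_format = None
    if active_format_flag:
        if len(buf) < 10:
            raise ValueError("active_format byte missing")
        active_format = buf[9] & 0x0F          # low 4 bits; high 4 reserved
    return {"active_format_flag": active_format_flag,
            "active_format": active_format}
```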

MPEG-4.10 (H.264) Video 

AFD is optionally carried in the SEI RBSP 
syntax of the video elementary stream. 

The user_data_start_code field of the MPEG-2
AFD syntax is replaced with the following two
fields:

Itu_t_t35_country_code

This 8-bit field has a value of 0xB5.

Itu_t_t35_provider_code

This 16-bit field has a value of 0x0031.

SMPTE 421M (VC-1) Video 

AFD is optionally carried in the user data 
of the video elementary stream. 

The user_data_start_code field of the MPEG-2
AFD syntax is replaced with the following field:

VC1_user_data_start_code

This 32-bit string of 0x0000011D indicates
the beginning of user_data.
AFD Active      WSS Bits
Format Value    b3, b2,
(R3-R0)         b1, b0     4:3 Coded Frames          16:9 Coded Frames

0000-0001       reserved
0010            0100       Not recommended           Not recommended
0011            0010       Not recommended           Not recommended
0100            1101       >16:9 aspect ratio        >16:9 aspect ratio
0101-0111       reserved
1000            -          4:3 full frame image      16:9 full frame image
1001            1000       4:3 full frame image      4:3 pillarbox image
1010            1011       16:9 letterbox image      16:9 full frame image
1011            0001       14:9 letterbox image      14:9 pillarbox image
1100            reserved
1101            1110       4:3 full frame image,     4:3 pillarbox image,
                           alternative 14:9 center   alternative 14:9 center
1110            -          16:9 letterbox image,     16:9 full frame image,
                           alternative 14:9 center   alternative 14:9 center
1111            -          16:9 letterbox image,     16:9 full frame image,
                           alternative 4:3 center    alternative 4:3 center

Table 13.56. AFD active_format Values. Any combination of Active Format Description and bar
data may be present in video user data (either, neither, or both). Note that AFD data may not
always exactly match bar data because AFD only deals with 4:3, 14:9, and 16:9 aspect ratios
while bar data can accurately represent nearly any aspect ratio. If AFD data is in conflict with bar
data, AFD data will take precedence unless AFD = “0000” or “0100.”
Subtitles

Subtitles consist of one or more com- 
pressed bitmap images, along with optional 
rectangular backgrounds for each. They are 
positioned at a defined location, being dis- 
played at a defined start time and for a defined 
number of video frames. 

The bitmap technique enables support for 
any language, rather than just those supported 
by the decoder, and enables the subtitle author 
complete control over the appearance of the 
characters, including font size and kerning. In 
addition, characters and symbols that are not a 
part of any standard character set can easily be 
used, such as those characters in ideographic 
languages which represent proper names. 

Digital Cable Subtitles 

The subtitle_message() in the SCTE 27
specification defines subtitle bitmaps
associated with a program. Timing for the display of
subtitle text is given as a Presentation Time
Stamp (PTS) referenced to the program’s
program clock (PCR).

The subtitle_message() is carried in
transport stream packets with PID = 0x1FFB. The
syntax is:

Table_ID

This 8-bit codeword has a value of 0xC6. 

Reserved_bits 

These 2 bits are always “00.” 

Reserved_bits 

These 2 bits are always “11.” 



Section_length 

This 12-bit binary number specifies the 
number of bytes after this field, up to and 
including CRC_32. 

Reserved_bit 

This bit is always “0.” 

Segmentation_overlay_included 

A “1” for this 1-bit flag indicates the
message includes the segmentation definition.

Protocol_version

This 6-bit binary number allows, in the
future, this message type to carry parameters
that may be structured fundamentally
differently from those defined in the current
protocol. At present, the subtitle message is defined
for protocol_version “00 0000” only.



Table_extension 

This 16-bit binary number is used to
differentiate between various segmented
message_body()s that are present
simultaneously on the Transport Stream, all delivered
using subtitle_message()s. This field is present
if segmentation_overlay_included = “1.”

Last_segment_number

This 12-bit binary number indicates the
segment number of the last segment needed to
recover the full message. This field is present if
segmentation_overlay_included = “1.”

Segment_number

This 12-bit binary number indicates which
part of a (perhaps) multi-part message is
present. This field is present if
segmentation_overlay_included = “1.”
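
The header fields up to this point pack into four bytes, followed by five more bytes when segmentation_overlay_included is set. The sketch below unpacks them using the bit widths listed above. The function name and the body_offset key are hypothetical, and the input is assumed to be a complete section starting at Table_ID.

```python
SUBTITLE_TABLE_ID = 0xC6


def parse_subtitle_header(section):
    """Unpack the leading fields of a subtitle_message() section."""
    if len(section) < 4:
        raise ValueError("section too short")
    if section[0] != SUBTITLE_TABLE_ID:
        raise ValueError("not a subtitle_message")
    header = {
        # 2 + 2 reserved bits, then the 12-bit section_length
        "section_length": ((section[1] & 0x0F) << 8) | section[2],
        # 1 reserved bit, 1 flag bit, 6-bit protocol_version
        "segmentation_overlay_included": bool(section[3] & 0x40),
        "protocol_version": section[3] & 0x3F,
        "body_offset": 4,
    }
    if header["segmentation_overlay_included"]:
        if len(section) < 9:
            raise ValueError("segmentation fields truncated")
        header["table_extension"] = (section[4] << 8) | section[5]
        # two 12-bit numbers packed into three bytes
        header["last_segment_number"] = (section[6] << 4) | (section[7] >> 4)
        header["segment_number"] = ((section[7] & 0x0F) << 8) | section[8]
        header["body_offset"] = 9
    return header
```

The body_offset value marks where the remaining fields, beginning with the language code, start in the section.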
ISO_639_language_code

This 24-bit field contains a 3-character lan- 
guage code. 

Pre_clear_display 

A “1” for this flag indicates that the entire
display is to be made transparent prior to the
display of the subtitle text. Otherwise, the
subtitle text is to be added to the text already on
the screen (cumulative display).

Immediate

A “1” for this 1-bit flag indicates that the
subtitle is to be displayed immediately upon
receipt. Otherwise, it should be cued for
display at the display_in_PTS time.

Reserved_bit 

This bit is always “0.” 

Display_standard 

This 5-bit codeword specifies the display 
format for which the subtitle was prepared. 

00000 = 720x480 

00001 = 720x576 
00010 = 1280x720 
00011 = 1920x1080 

Display_in_PTS 

When this 32-bit value matches the 32 least
significant bits of the 33-bit MPEG program
clock (90 kHz portion), the subtitle is to be
displayed.
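
The comparison described above uses only the low 32 bits of the 33-bit clock value. The sketch below shows the exact match, plus a wraparound-tolerant "has the time been reached" test. The second function is an assumption about how a decoder that samples the clock at discrete intervals might behave; it is not taken from the specification, and both function names are hypothetical.

```python
MASK_32 = 0xFFFFFFFF
HALF_RANGE = 0x80000000


def pts_matches(display_in_pts, clock_90khz):
    """Exact match against the 32 least significant clock bits."""
    return (clock_90khz & MASK_32) == display_in_pts


def pts_reached(display_in_pts, clock_90khz):
    """True once the clock is at or past display_in_pts, modulo 2**32.

    Assumption: differences of less than half the 32-bit range are
    treated as "already reached"; larger ones as "still in the future".
    """
    elapsed = ((clock_90khz & MASK_32) - display_in_pts) & MASK_32
    return elapsed < HALF_RANGE
```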

Subtitle_type 

This 4-bit codeword indicates the format of 
the subtitle data block. Currently, only a value 
of “0001” is defined. 

Reserved_bit 

This bit is always “0.” 



Display_duration 

This 11-bit binary number indicates the 
number of video frames, from 1-2000, for 
which the subtitle data is to be displayed. 

Block_length 

This 16-bit binary number indicates the 
number of bytes that follow, excluding the 
CRC and any descriptors. 



Note: The following fields are present when subtitle_type
= “0001.”

Reserved_bits 

These 5 bits are always “0 0000.” 

Background_style 

This 1-bit flag specifies the background 
style: 

0 = transparent 

1 = framed 

Outline_style 

This 2-bit codeword specifies the text out- 
line style: 

00 = none 

01 = outline 

10 = drop shadow 

11 = reserved 

Bitmap_Y_component 

This 5-bit binary number specifies the 
value of Y, with a range 0-31, for the text color. 

Bitmap_opaque_enable 

A “1” for this 1-bit flag indicates that the
text color shall be opaque (no video blend). A
“0” indicates a 50% mix with the video is to be
done.

Bitmap_Cr_component

This 5-bit binary number specifies the 
value of Cr, with a range 0-31, for the text 
color. 

Bitmap_Cb_component 

This 5-bit binary number specifies the 
value of Cb, with a range 0-31, for the text 
color. 

Bitmap_top_H_coordinate 

This 12-bit binary number, with a range of 
0-1919, specifies the horizontal coordinate of 
the left-most pixel of the decompressed bit- 
map. 

Bitmap_top_V_coordinate 

This 12-bit binary number, with a range of 
0-1079, specifies the vertical coordinate of the 
top line of the decompressed bitmap. 

Bitmap_bottom_H_coordinate 

This 12-bit binary number, with a range of 
0-1919, specifies the horizontal coordinate of 
the right-most pixel of the decompressed bit- 
map. 

Bitmap_bottom_V_coordinate 

This 12-bit binary number, with a range of 
0-1079, specifies the vertical coordinate of the 
bottom line of the decompressed bitmap. 



Frame_top_H_coordinate 

This 12-bit binary number, with a range of
0-1919, specifies the horizontal coordinate of
the left-most pixel of the frame. This field is
present if background_style = “1.”

Frame_top_V_coordinate

This 12-bit binary number, with a range of
0-1079, specifies the vertical coordinate of the
top line of the frame. This field is present if
background_style = “1.”

Frame_bottom_H_coordinate

This 12-bit binary number, with a range of
0-1919, specifies the horizontal coordinate of
the right-most pixel of the frame. This field is
present if background_style = “1.”

Frame_bottom_V_coordinate 

This 12-bit binary number, with a range of 
0-1079, specifies the vertical coordinate of the 
bottom line of the frame. This field is present if 
background_style = “1.” 

Frame_Y_component 

This 5-bit binary number specifies the
value of Y, with a range 0-31, for the frame
color. This field is present if background_style =
“1.”

Frame_opaque_enable

A “1” for this 1-bit flag indicates that the
frame color shall be opaque (no video blend).
A “0” indicates a 50% mix with the video is to be
done. This field is present if background_style =
“1.”

Frame_Cr_component

This 5-bit binary number specifies the
value of Cr, with a range 0-31, for the frame
color. This field is present if background_style =
“1.”

Frame_Cb_component

This 5-bit binary number specifies the
value of Cb, with a range 0-31, for the frame
color. This field is present if background_style =
“1.”



Reserved_bits 

These 4 bits are always “0000.” This field is 
present if outline_style = “01.” 

Outline_thickness 

This 4-bit binary number, with a range 0- 
15, specifies the text outline thickness. This 
field is present if outline_style = “01.” 

Outline_Y_component 

This 5-bit binary number specifies the
value of Y, with a range 0-31, for the text
outline color. This field is present if outline_style =
“01.”

Outline_opaque_enable

A “1” for this 1-bit flag indicates that the
text outline color shall be opaque (no video
blend). A “0” indicates a 50% mix with the video
is to be done. This field is present if
outline_style = “01.”

Outline_Cr_component

This 5-bit binary number specifies the
value of Cr, with a range 0-31, for the text
outline color. This field is present if outline_style =
“01.”

Outline_Cb_component

This 5-bit binary number specifies the
value of Cb, with a range 0-31, for the text
outline color. This field is present if outline_style =
“01.”



Shadow_right 

This 4-bit binary number, with a range 0- 
15, specifies the text right shadow thickness. 
This field is present if outline_style = “10.” 

Shadow_bottom 

This 4-bit binary number, with a range 0- 
15, specifies the text bottom shadow thickness. 
This field is present if outline_style = “10.” 

Shadow_Y_component 

This 5-bit binary number specifies the 
value of Y, with a range 0-31, for the text 
shadow color. This field is present if 
outline_style = “10.” 

Shadow_opaque_enable 

A “1” for this 1-bit flag indicates that the
text shadow color shall be opaque (no video
blend). A “0” indicates a 50% mix with the video
is to be done. This field is present if
outline_style = “10.”

Shadow_Cr_component 

This 5-bit binary number specifies the
value of Cr, with a range 0-31, for the text
shadow color. This field is present if
outline_style = “10.”

Shadow_Cb_component 

This 5-bit binary number specifies the
value of Cb, with a range 0-31, for the text
shadow color. This field is present if
outline_style = “10.”



Reserved_bits 

Each of these 3 bytes has a value of “0000
0000.” This field is present if outline_style =
“11.”

Bitmap_length

This 16-bit binary number specifies the 
number of bytes in the following compressed 
bitmap. 

Compressed_bitmap() 



Reserved_bits 

Each of these [n] bytes has a value of
“0000 0000.” They are only present when
subtitle_type ≠ “0001.”



Descriptor_loop

[n] descriptors may be present in this
descriptor_loop.

CRC_32 

32-bit CRC value. 

DVB Subtitles 

DVB subtitles (ETSI EN 300 743) are 
much more complex than digital cable subti- 
tles. 

Subtitle streams are carried in PES packets
with stream_id = private stream 1. The timing
of their display is given as a presentation time
stamp (PTS) referenced to the program’s
program clock reference (PCR). A subtitle
stream conveys one or more subtitle services.

Each subtitle service contains text or graph- 
ics needed to provide subtitles for a particular 
purpose; separate subtitle services may be 
used, for example, to convey subtitles in sev- 
eral languages. Each subtitle service displays 
its information in a sequence of subtitle pages. 

Subtitle pages are overlaid on the video 
image. A subtitle page contains one or more 
subtitle regions. 



Each subtitle region is a rectangular area 
with attributes such as position, size, pixel 
depth and background color. A subtitle region 
is used as the background structure into which 
one or more subtitle objects are placed. 

A subtitle object represents a character, 
word, line of text, entire sentence, logo, or 
icon. 

The PES data field syntax is: 

Data_identifier 

The value for this 8-bit field is 0x20, indicat- 
ing DVB subtitle stream. 

Subtitle_stream_ID 

The value for this 8-bit field is 0x00, indicat- 
ing DVB subtitle stream. 

Sync_byte 

This 8-bit field has a value of “0000 1111.”

Segment_type

This 8-bit field indicates the type of data
contained in segment_data_field. The following
segment_type values are defined for subtitling:

0x10 = page composition segment 
0x11 = region composition segment 
0x12 = CLUT definition segment 
0x13 = object data segment 
0x14 = display definition segment 
0x80 = end of display set segment 

Page_ID 

This 16-bit binary number identifies the
subtitle service. Segments with a value
matching composition_page_ID in the DVB
Subtitling Descriptor carry data for one
subtitle service. Segments with a value matching
ancillary_page_ID carry data that may be
shared by multiple subtitle services.

A frequent, and often preferred, method is
to convey distinct services by using different
streams on separate PIDs.

Segment_length 

This 16-bit binary number indicates the
number of bytes contained in
segment_data_field.

Segment_data_field 

This is the payload of the segment. Several 
segment types are defined: 

Page composition segment carries
information on the page composition, such as the
list of included regions, each region’s position,
and any time-out information for the page.

Region composition segment carries
information on the region composition and
attributes, such as the size, background color,
pixel depth, which color lookup table
(CLUT) to use, and a list of included objects
with their position within the region.

CLUT definition segment contains
information on a specific CLUT, such as the
colors used for the CLUT entries.

Object data segment carries information on
a specific text or graphical object. Object data
segments for graphical objects contain
run-length encoded bitmap colors; for text
objects, a string of character codes is carried.

End of display set segment is used to signal
that no more segments need to be received
before the decoding of the current display set
can begin.

End_of_PES_data_field_marker 

An 8-bit field with a value of “1111 1111.”
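
The PES data field layout described above can be walked with a few lines of
code. The following is a minimal sketch, not a conforming decoder: it checks
the data_identifier (0x20) and subtitle_stream_ID (0x00), then reads segments
that start with the sync byte 0x0F until the end marker 0xFF. The function
name parse_pes_data_field is hypothetical, and the segment payloads are
returned as raw bytes without further interpretation.

```python
import struct

# segment_type values listed in the text
SEGMENT_TYPES = {
    0x10: "page composition",
    0x11: "region composition",
    0x12: "CLUT definition",
    0x13: "object data",
    0x14: "display definition",
    0x80: "end of display set",
}


def parse_pes_data_field(data: bytes):
    """Return a list of (segment name, page_id, payload) tuples."""
    if len(data) < 3 or data[0] != 0x20 or data[1] != 0x00:
        raise ValueError("not a DVB subtitle PES data field")
    segments = []
    pos = 2
    while pos < len(data) and data[pos] == 0x0F:      # sync_byte
        if pos + 6 > len(data):
            raise ValueError("truncated segment header")
        # segment_type (8), page_id (16), segment_length (16), big-endian
        segment_type, page_id, length = struct.unpack_from(">BHH", data, pos + 1)
        start = pos + 6
        end = start + length
        if end > len(data):
            raise ValueError("truncated segment payload")
        name = SEGMENT_TYPES.get(segment_type, "unknown/reserved")
        segments.append((name, page_id, data[start:end]))
        pos = end
    if pos >= len(data) or data[pos] != 0xFF:         # end marker
        raise ValueError("missing end_of_PES_data_field_marker")
    return segments


# Example: one page composition segment with a 2-byte payload,
# followed by an end of display set segment, both for page 1.
example = bytes([0x20, 0x00,
                 0x0F, 0x10, 0x00, 0x01, 0x00, 0x02, 0xAA, 0xBB,
                 0x0F, 0x80, 0x00, 0x01, 0x00, 0x00,
                 0xFF])
parsed = parse_pes_data_field(example)
```

A receiver would normally filter the returned segments by page_id before
decoding them, keeping those that match the composition or ancillary page
of the selected subtitle service.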

Enhanced Television Programming

As discussed in Chapter 8, SMPTE 363M
Transport Type B broadcast data using IP
multicast binding is delivered as three
components: announcements, triggers, and
resources. Announcements are delivered on a
known multicast IP address and UDP port, and
point to triggers and resources that are
available on specified multicast IP addresses
and UDP ports.




Figure 13.26. Announcement, Trigger and Resource Data Carried on a Single PID Stream.
[The data stream contains the announcement,
trigger, and resource. It may contain multiple
triggers and resources based on source
attributes.]

Figure 13.27. Announcement Carried Separately from Triggers and Resources.
[The announcement is carried on a separate data
stream from the triggers and resources. The
stream may contain multiple triggers and
resources based on source attributes.]



Figure 13.28. Sharing Triggers and Resources
[The PMTs for programs A and B each point to
their own announcement stream and also include
pointers to the data streams that carry the
triggers and resources shared by both services.
Each program has its own audio, video, and
announcement data streams. The shared PID
streams carry triggers and resources with
different attributes.]

The SCTE 42 specification defines how
these announcements, triggers, and resources
can be included as part of an MPEG-2 stream,
as shown in Figure 13.26. The data is
associated to a program by identifying it within
the program’s PMT through the use of the MAC
Address List Descriptor.

In some cases, it is desirable to carry
announcements in their own streams with
unique PIDs; triggers and resources are
carried in a separate stream with a unique PID,
as illustrated in Figure 13.27. The advantage of
this technique is that the decoder does not
have to process the IP datagrams associated
with triggers and resources if the application
is not enabled to receive them. Announcements
are always processed, but are small in
comparison. Triggers and resources may also be
carried on separate streams based on their
characteristics, such as language type, target
audience, size of resource, maximum bit-rate, etc.

Figure 13.28 illustrates an extension to Fig- 
ure 13.27. The trigger and resource streams 
are shared between multiple programs. For 
example, two programs may wish to provide 
data from the same weather service. 



Data Broadcasting 

MPEG-2 supports a variety of content dis- 
tribution tools and protocols via the Digital 
Storage Media Command and Control (DSM- 
CC) specification (MPEG-2.6). Applications 
that can take advantage of the DSM-CC tools 
include: 

- Video-on-Demand 

- Data broadcasting 

- Internet access 

- IP multicasting 



At first, DSM-CC simply offered VCR-like
functions (fast-forward, rewind, pause, etc.) as
an annex to MPEG-2.1. It was later expanded
into MPEG-2.6 to handle the selection, access,
and control of distributed content. As a result,
DSM-CC now encompasses a larger set of
tools:

- Network session and resource control 

- Client configuration 

- Downloading of a client 

- Stream control, file access 

- Interactive and broadcast download 

- Data and object carousels 

- Switched digital broadcast channel change 
protocol 

Figure 13.29 illustrates how ATSC and 
DVB data broadcasting use DSM-CC to imple- 
ment a variety of data broadcast features. 

Carousels 

Carousels cyclically repeat their content. If 
a receiver wants to access particular data from 
a carousel, it simply waits for the next time that 
the data is broadcast. 

Data carousels contain unspecified data, so 
a receiver has to know what to do with it when 
it is received. Data carousels are often used for 
downloading new system software. 

Object carousels contain identifiable data
such as data streams, pictures, trigger events,
executable applications, etc., along with a
directory listing all objects in the carousel.
Object carousels are often used for shopping
services, electronic program guides (EPG),
advertisements, and other interactive functions.
Unlike data carousels, object carousels can vary
the repetition rate of individual objects.

Figure 13.29. Encapsulation Overview.

IP Multicasting over MPEG-2 Transport

IP multicasting conveys IP datagrams over
MPEG-2 transport streams based on DSM-CC.
ATSC (A/92) and DVB (ETSI EN 301 192) use
slightly different forms of LAN emulation to
convey packet data. SCTE 42 for digital cable
systems requires support for both techniques.

There are two primary protocols: DVB 
Multiprotocol Encapsulation (MPE) and ATSC 
DSM-CC addressable sections. Digital cable 
systems commonly use one or the other, but 
not both simultaneously. However, encapsula- 
tion changes may occur at any time within a 
program. 

The DVB implementation is compliant
with DSM-CC sections containing private data.
The table_id field is 0x3E, indicating a DSM-CC
section containing private data (see Table
13.49), in this case MPE datagram sections.

The ATSC implementation is also compliant
with DSM-CC sections containing private
data. The table_id field is 0x3F, a DSM-CC
section containing addressable sections (see
Table 13.49).

Each stream that carries IP multicasting
data has a stream_type = 0x0D associated with
it within the PMT, indicating that it carries
DSM-CC sections.

Formulation of the MAC Address 

IETF RFC 1112 specifies the mapping of an
IP multicast address to an Ethernet MAC
address (ATSC calls it deviceId). The IP
multicast address is mapped into the
corresponding hardware multicast address by
placing the low-order 23 bits of the IP multicast
address into the low-order 23 bits of the MAC
address 01:00:5E:xx:xx:xx (base 16). Bit 23 of
the MAC address is always “0” per IETF RFC 1700.
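
The mapping described above can be illustrated with a short sketch. The
function name multicast_mac is hypothetical; the code simply copies the
low-order 23 bits of an IPv4 multicast address into the 01:00:5E:00:00:00
prefix, which leaves bit 23 of the MAC address at “0.”

```python
import ipaddress


def multicast_mac(ip: str) -> str:
    """Map an IPv4 multicast address to its Ethernet multicast MAC address."""
    addr = ipaddress.IPv4Address(ip)
    if not addr.is_multicast:
        raise ValueError(f"{ip} is not an IPv4 multicast address")
    low23 = int(addr) & 0x7FFFFF           # low-order 23 bits of the IP address
    mac = 0x01005E000000 | low23           # bit 23 of the MAC stays 0
    return ":".join(f"{(mac >> shift) & 0xFF:02X}" for shift in range(40, -8, -8))


# Example usage
mac_a = multicast_mac("224.0.0.1")
mac_b = multicast_mac("224.128.0.1")
```

Because only 23 of the 28 variable bits of a multicast IP address are kept,
32 different IP multicast addresses share each MAC address. In the example,
224.0.0.1 and 224.128.0.1 map to the same MAC address, so a receiver that
filters on the MAC address may still need to check the IP destination.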



Transporting over MPEG-2 

Figure 13.30 illustrates how the IP data- 
grams are encapsulated and segmented into 
MPEG-2 transport packets. IP datagrams are 
fragmented at the IP layer so that they do not 
exceed the specified Maximum Transfer Unit 
(MTU) size, typically 4080 bytes. 

A single datagram section may span
multiple MPEG-2 packets of the same PID. Also,
messages may be placed back-to-back in the
MPEG packet payload. This requires the use
of pointer_field (PF) to point to the location of
the beginning of the next message.

The MAC Address List Descriptor is used to 
identify data, by multicast MAC group 
addresses, being carried by each stream. 

Data Broadcasting Mechanisms 

There are a wide variety of encapsulation 
protocols used to transport data within an 
MPEG-2 transport stream. IP multicasting has 
been previously discussed. Other common 
techniques include asynchronous data stream- 
ing, synchronous data streaming, and synchro- 
nized data streaming. 

Asynchronous Data Streaming 

Asynchronous data streams, carried in 
DSM-CC sections (ATSC) or PES packets 
(ARIB and DVB), are used for applications 
where the delivery of data is not subject to any 
timing constraints. 

For ARIB and DVB, the PES packet
stream_id = 0xBF for private stream 2. Since
the maximum size of a PES packet is 64 KB, it
is segmented into as many 184-byte units as
needed, to match the transport stream
requirements.
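
As a rough illustration of this segmentation, the sketch below splits a PES
packet into 184-byte units, the payload size of a 188-byte transport packet
with a 4-byte header. The name split_into_units is hypothetical, and the
sketch leaves the final short unit as-is, whereas a real multiplexer would
have to fill the remainder of that transport packet.

```python
TS_PAYLOAD_SIZE = 184  # bytes of payload per transport packet


def split_into_units(pes_packet: bytes) -> list:
    """Split a PES packet into consecutive units of at most 184 bytes."""
    return [pes_packet[i:i + TS_PAYLOAD_SIZE]
            for i in range(0, len(pes_packet), TS_PAYLOAD_SIZE)]


# Example: a 64 KB packet needs 357 units; the last carries only 32 bytes.
units = split_into_units(bytes(64 * 1024))
unit_count = len(units)
last_unit_size = len(units[-1])
```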

Stream_type = 0x0D (ARIB and ATSC) or
0x06 (DVB) within the PMT.







Synchronous Data Streaming 

Synchronous data streams, carried as PES 
packet payloads, are used for applications 
requiring continuous streaming of data to a 
receiver at a regular and constant data rate. To 
achieve this, timing information is included 
within the stream. 

Since the maximum size of a PES packet is 
64 KB, it is segmented into as many 184-byte 
units as needed, to match the transport stream 
requirements. 

The PES packet stream_id = 0xBD for
private stream 1, allowing the use of PES header
fields, including the Presentation Time Stamp
(PTS). However, the resolution of the PTS is
extended from 11.1 µs to 74 ns. Stream_type =
0x0D (ARIB), 0xC2 (ATSC), or 0x06 (DVB)
within the PMT.

Synchronized Data Streaming 

Synchronized data streams, carried as PES 
packet payloads, are used for applications 
requiring presentation of data at precise but 
not necessarily regular times. The presenta- 
tion times are usually associated with a video, 
audio, or data stream. 



Since the maximum size of a PES packet is 
64 KB, it is usually segmented into as many 
184-byte units as needed, to match the trans- 
port stream requirements. 

The PES packet stream_id = 0xBD for
private stream 1, allowing the use of PES header
fields, including the Presentation Time Stamp
(PTS). Stream_type = 0x0D (ARIB) or 0x06
(ATSC and DVB) within the PMT.
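
The PMT stream_type values quoted for the three streaming mechanisms can be
gathered into a small lookup table. This is only a summary of the values
given in the text above, expressed as a hypothetical Python dictionary. It
is not an exhaustive list of stream types.

```python
# (mechanism, standard) -> stream_type signaled in the PMT, per the text
STREAM_TYPE = {
    ("asynchronous", "ARIB"): 0x0D,
    ("asynchronous", "ATSC"): 0x0D,
    ("asynchronous", "DVB"): 0x06,
    ("synchronous", "ARIB"): 0x0D,
    ("synchronous", "ATSC"): 0xC2,
    ("synchronous", "DVB"): 0x06,
    ("synchronized", "ARIB"): 0x0D,
    ("synchronized", "ATSC"): 0x06,
    ("synchronized", "DVB"): 0x06,
}


def stream_type_for(mechanism: str, standard: str) -> int:
    """Look up the PMT stream_type for a data streaming mechanism."""
    return STREAM_TYPE[(mechanism.lower(), standard.upper())]


atsc_sync = stream_type_for("synchronous", "ATSC")
```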

Data Piping 

Data piping is a basic asynchronous
transportation mechanism for data over MPEG-2
transport streams: data is inserted directly in
the payload of MPEG-2 transport stream
packets. Sections, tables, and PES data
structures are not used. There is no standardized
way for the splitting and reassembly of the
datagrams; this is defined by the application.

Decoder Considerations

The video decoder essentially performs 
the inverse function of the encoder. From the 
coded bitstream, it reconstructs the I frames. 
Using I frames, additional coded data, and 
motion vectors, the P and B frames are gener- 
ated. Finally, the frames are output in the 
proper order. 

Figure 13.31 illustrates the block diagram 
of a basic MPEG-2 video decoder. Figures 
13.32 and 13.33 illustrate support for SNR and 
temporal scalability, respectively. 

Audio and Video Synchronization 

An MPEG-2 encoder produces PES pack- 
ets having a different PID (packet identifica- 
tion) for each program. The MPEG-2 decoder 
recognizes only packets with the PID for the 
selected program and ignores the others. 

The MPEG-2 encoder contains a 27 MHz 
oscillator and 33-bit counter, called the STC 
(system time clock). STC is a 33-bit value 
driven by a 90 kHz clock, obtained by dividing 
the 27 MHz clock by 300. It belongs to a partic- 
ular program and is the master clock of the 
video and audio encoders for that program. 
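
The clock relationships above lend themselves to a short worked example.
The sketch below derives the 90 kHz rate from the 27 MHz clock, converts
33-bit counter values to seconds, and computes elapsed ticks across a
counter wraparound. The helper names are hypothetical.

```python
SYSTEM_CLOCK_HZ = 27_000_000
STC_HZ = SYSTEM_CLOCK_HZ // 300      # 27 MHz / 300 = 90 kHz
STC_MODULUS = 1 << 33                # the counter is 33 bits wide


def ticks_to_seconds(ticks: int) -> float:
    """Convert a count of 90 kHz ticks to seconds."""
    return ticks / STC_HZ


def stc_elapsed(earlier: int, later: int) -> int:
    """Ticks elapsed from `earlier` to `later`, allowing for one wraparound."""
    return (later - earlier) % STC_MODULUS


tick_period_us = 1e6 / STC_HZ                            # about 11.1 microseconds
wrap_hours = ticks_to_seconds(STC_MODULUS) / 3600        # about 26.5 hours
elapsed = stc_elapsed(STC_MODULUS - 10, 5)               # 15 ticks across the wrap
```

One tick of the 90 kHz clock lasts about 11.1 µs, and the 33-bit counter
wraps roughly every 26.5 hours, so any comparison of two time stamps should
be done modulo 2^33 as shown.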

Time Stamps 

After compression, pictures may be sent 
out of sequence due to any bi-directional cod- 
ing that may be present. Each picture has a 
variable amount of data and may have a vari- 
able delay due to multiplexing and transmis- 
sion. In order to keep the audio and video 
synchronized, time stamps are periodically 
sent with a picture. 

The MPEG-2 encoder notes the time of
occurrence of an input picture or audio block
(and of the appearance of its coded output) by
sampling the STC. A constant value equal to
the sum of encoder and decoder buffer delays
is added, creating a 33-bit presentation time
stamp (PTS). A 33-bit decode time stamp
(DTS) may also be added, indicating the time
at which the data should be taken from the
MPEG-2 decoder’s buffer and decoded. DTS
and PTS are identical, except in the case of
picture reordering for bi-directional (B) pictures.

Since presentation times are evenly 
spaced, it is not always necessary to include a 
time stamp (they can be interpolated by the 
decoder), but they must not be more than 700 
ms apart. Lip sync is obtained by incorporating 
time stamps into the headers of both video and 
audio PES packets. 

PTS and DTS 

When B pictures are present, a picture 
may have to be decoded before it is presented, 
so that it can act as a reference for a B picture. 
Although, for example, pictures can be pre- 
sented in the order IBBP, they are transmitted 
in the order IPBB. Consequently, two types of 
time stamps exist. 

DTS indicates the time when a picture 
must be decoded, whereas PTS indicates when 
it must be present at the output of the MPEG-2 
decoder. B pictures are decoded and presented 
simultaneously so they only contain PTS. 
When an IPBB sequence is received, both I 
and P must be decoded before the first B pic- 
ture. An MPEG-2 decoder can only decode one 
picture at a time; therefore the I picture is 
decoded first and stored. While the P picture is 
being decoded, the decoded I picture is output 
so that it can be followed by the B pictures. 
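
The reordering described above can be sketched as follows. The function is
a simplified illustration, assuming each B picture depends only on the
nearest reference pictures on either side of it; its name is hypothetical.
It converts a display-order sequence into transmission (decode) order by
sending each I or P picture ahead of the B pictures that precede it in
display order.

```python
def transmission_order(display_order: str) -> list:
    """Return (picture type, display index) tuples in transmission order."""
    out = []
    pending_b = []                    # B pictures waiting for their next reference
    for index, picture_type in enumerate(display_order):
        if picture_type == "B":
            pending_b.append((picture_type, index))
        elif picture_type in ("I", "P"):
            out.append((picture_type, index))   # reference picture goes first
            out.extend(pending_b)               # then the B pictures it enables
            pending_b.clear()
        else:
            raise ValueError(f"unknown picture type: {picture_type!r}")
    out.extend(pending_b)             # trailing B pictures, if any
    return out


sent = transmission_order("IBBP")
sent_types = "".join(t for t, _ in sent)
```

For the display order IBBP, the result is IPBB: the P picture, displayed
fourth, is transmitted second so that it is available as a reference when
the two B pictures are decoded.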

The PTS and DTS flags in the PES packet
header are set to indicate the presence of PTS
alone or both PTS and DTS. Audio PES packet
headers contain only a PTS. Since audio PES
packets are never transmitted out of sequence,
they do not contain DTS fields.

Figure 13.33. Simplified MPEG-2 Temporal Scalability Decoder Block Diagram.

PTS (or DTS) is included in the bitstream
at intervals not exceeding 700 ms. ATSC
further constrains PTS or DTS to be inserted at
the beginning of each coded picture.

PCR 

The output of the MPEG-2 encoder is also 
time stamped with STC values, called PCR 
(program clock reference) or SCR (system 
clock reference), used to synchronize the 
MPEG-2 decoder’s STC with the MPEG-2 
encoder’s STC. In a program stream, the clock 
reference is called SCR; in a transport stream, 
the clock reference is called PCR. 

The adaptation field in the transport 
stream packet header is periodically used to 
include the PCR information. MPEG-2 
requires a minimum of ten PCRs per second be 
sent, while DVB specifies a minimum of 25 
PCRs per second. SCR is required to occur at 
least once every 700 ms. 

Synchronization 

Synchronization may be achieved by
locking the MPEG-2 decoder’s 27 MHz clock to the
received PCR using a VCXO and PLL, as
shown in Figure 13.34. This technique ensures
that the decoder’s receive buffer does not
overflow or underflow during long periods of
continuous operation as a result of the source
clock being slightly faster or slower than the
decoder clock. An adjustment range of ±100
ppm is typically required for streaming video
applications.



At the MPEG-2 decoder, the VCXO
generates a nominal 27 MHz clock that drives the
local PCR counter. This local PCR is compared
with the received PCR, resulting in a
PCR phase error. The phase error is filtered to
control the VCXO, which brings the local PCR
count into step with the received PCRs. The
discontinuity indicator may reset
the local PCR count and may optionally be
used to reduce the filtering to help the
MPEG-2 decoder lock more quickly to the new timing.
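
The clock recovery loop can be modeled in software to see how the filtered
phase error pulls the local clock toward the source clock. The following is
a simplified simulation, not a hardware design: it assumes a PCR arrives
every 40 ms, models the loop filter as a proportional-integral controller,
and uses illustrative gain values (kp, ki) that are not taken from any
standard. The VCXO control is clamped to ±100 ppm.

```python
def simulate_lock(source_ppm: float, steps: int = 500, interval_s: float = 0.04,
                  kp: float = 0.2, ki: float = 0.02):
    """Simulate a PI loop locking a local 27 MHz counter to received PCRs.

    Returns (final VCXO correction in ppm, final phase error in 27 MHz ticks).
    """
    nominal_hz = 27_000_000
    source_count = 0.0       # PCR count at the encoder
    local_count = 0.0        # local PCR counter at the decoder
    integral = 0.0           # accumulated phase error (loop filter state)
    control_ppm = 0.0        # frequency correction applied to the VCXO
    error = 0.0
    for _ in range(steps):
        source_count += nominal_hz * (1 + source_ppm * 1e-6) * interval_s
        local_count += nominal_hz * (1 + control_ppm * 1e-6) * interval_s
        error = source_count - local_count          # PCR phase error
        integral += error
        control_ppm = kp * error + ki * integral    # filtered control value
        control_ppm = max(-100.0, min(100.0, control_ppm))
    return control_ppm, error


# Example: the source clock runs 30 ppm fast.
final_ppm, final_error = simulate_lock(30.0)
```

With these gains the loop settles with the local clock running about 30 ppm
fast and the phase error close to zero. Lower gains would give heavier
filtering of PCR jitter at the cost of a slower lock.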

Lip Sync Issues

Lip sync is an implementation issue that 
has nothing to do with the MPEG-2 standard. 
Assuming that the audio and video are in syn- 
chronization at the MPEG-2 encoder input, 
support of the PCR and PTS provides the tools 
needed to maintain the audio and video timing 
relationship. MPEG-2 decoders, and possibly 
MPEG-2 encoders, with incorrect implementa- 
tions of these tools have been manufactured. 

A representative example of an incorrect
implementation in MPEG-2 decoders has to do
with the reading of the PTS. Some MPEG-2
decoders simply read the PTS when the
station is first tuned in, then incorrectly
“free-wheel” afterwards on the basis of temporal
reference values, never checking the PTS
again until the channel is changed or turned
on again. Others completely ignore the
temporal reference values and instead look at the
time stamps. As time goes by, early
implementation mistakes are being corrected,
reducing the occurrence of lip sync problems.

MPEG-4 and H.264



MPEG-4 builds upon the success and 
experience of MPEG-2. It is best known for: 



- Lower bit-rates than MPEG-2 (for the same 
quality of video) 

- Use of natural or synthetic objects that can be 
rendered together to make a scene 

- Support for interactivity 

For authors, MPEG-4 enables creating con- 
tent that is more reusable and flexible, with 
better content protection capabilities. 

For consumers, MPEG-4 can offer more 
interactivity and, due to the lower bit-rate over 
MPEG-2, the ability to enjoy content over new 
networks (such as DSL) and mobile products. 

MPEG-4 is an ISO standard (ISO/IEC
14496), and currently consists of 19 parts:

ISO/IEC 14496-1     systems
ISO/IEC 14496-2     visual
ISO/IEC 14496-3     audio
ISO/IEC 14496-4     conformance testing
ISO/IEC 14496-5     reference software
ISO/IEC 14496-6     DMIF
ISO/IEC 14496-7     reference software
ISO/IEC 14496-8     carriage over IP networks
ISO/IEC 14496-9     reference hardware
ISO/IEC 14496-10    advanced video (H.264)
ISO/IEC 14496-11    scene description
ISO/IEC 14496-12    ISO file format
ISO/IEC 14496-13    IPMP extensions
ISO/IEC 14496-14    MP4 file format
ISO/IEC 14496-15    H.264 file format
ISO/IEC 14496-16    animation extension
ISO/IEC 14496-17    streaming text format
ISO/IEC 14496-18    font compression
ISO/IEC 14496-19    synthesize texture stream


MPEG-4 provides a standardized way to
represent audio, video, or still image media
objects using descriptive elements (instead of
actual bits of an image, for example). A media
object can be natural or synthetic
(computer-generated) and can be represented
independent of its surroundings or background.

It also describes how to merge multiple 
media objects to create a scene. Rather than 
sending bits of picture, the media objects are 
sent, and the receiver composes the picture. 
This allows: 



- An object to be placed anywhere 

- Geometric transformations on an object 

- Grouping of objects 

- Modifying attributes and transform data 

- Changing the view of a scene dynamically

Audio Overview

MPEG-4 audio supports a wide variety of 
applications, from simple speech to multi-chan- 
nel high-quality audio. 

Audio objects (audio codecs) use specific 
combinations of tools to efficiently represent 
different types of audio objects. Profiles use 
specific combinations of audio object types to 
efficiently service a specific market segment. 
Levels specify size, rate, and complexity limita- 
tions within a profile to ensure interoperability. 

Currently, most solutions support a few of 
the most popular audio codecs (usually AAC- 
LC and HE-AAC) rather than one or more pro- 
files/levels. 

General Audio Object Types 

This category supports a wide range of 
quality, bit-rates, and number of channels. For 
natural audio, MPEG-4 supports the AAC 
(Advanced Audio Coding), BSAC (Bit Sliced 
Arithmetic Coding), and TwinVQ (Transform 
Domain Weighted Interleave Vector Quantiza- 
tion) algorithms. The following audio objects 
are available: 

AAC-Main Objects 

AAC-Main objects add the Perceptual 
Noise Shaping (PNS) tool to MPEG-2 AAC- 
Main. 

AAC-LC Objects 

AAC-LC (Low Complexity) objects add the 
PNS tool to MPEG-2 AAC-LC. There is also an 
Error Resilient version, ER AAC-LC. 



AAC-SSR Objects 

AAC-SSR (Scalable Sampling Rate) objects 
add the PNS tool to MPEG-2 AAC-SSR. 

AAC-LTP Objects 

AAC-LTP (Long Term Predictor) objects 
are similar to AAC-LC objects, with the long 
term predictor replacing the AAC-LC predic- 
tor. This gives the same efficiency with signifi- 
cantly lower implementation cost. There is also 
an Error Resilient version, ER AAC-LTP. 

AAC-Scalable Objects 

AAC-Scalable objects allow a large number 
of scalable combinations. They support only 
mono or 2-channel stereo sound. There is also 
an Error Resilient version, ER AAC-Scalable. 

ER AAC-LD Objects

Error Resilient AAC-LD (Low Delay) is
derived from AAC, and all the capabilities for
coding of two or more sound channels are
supported. These objects support sample rates
up to 48 kHz and use frame lengths of 512 or
480 samples (compared to 1024 or 960 samples
used by AAC) to enable a maximum algorithmic
delay of 20 ms.
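
The effect of the shorter frame lengths can be checked with simple
arithmetic. The sketch below computes only the duration of one frame at a
given sample rate; the total algorithmic delay of a codec also depends on
factors not covered here, so the numbers illustrate the trend rather than
the exact delay. The function name is hypothetical.

```python
def frame_duration_ms(frame_length: int, sample_rate_hz: int) -> float:
    """Duration of one audio frame in milliseconds."""
    return 1000.0 * frame_length / sample_rate_hz


# Frame durations at a 48 kHz sample rate
ld_480 = frame_duration_ms(480, 48_000)      # AAC-LD frame
ld_512 = frame_duration_ms(512, 48_000)      # AAC-LD frame
aac_960 = frame_duration_ms(960, 48_000)     # AAC frame
aac_1024 = frame_duration_ms(1024, 48_000)   # AAC frame
```

At 48 kHz, a 480-sample frame lasts 10 ms while a 960-sample frame lasts
20 ms, so halving the frame length halves the time spent buffering each
frame.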

ER BSAC Objects 

Error Resilient BSAC objects replace the
noiseless coding of AAC quantized spectral
data and the scale factors. One base layer
bitstream and many small enhancement layer
bitstreams are used, enabling real-time
adjustments to the quality of service.

HE-AAC Objects

HE-AAC (High Efficiency), a combination 
of AAC and Spectral Band Replication (SBR) 
technology, is designed for ultra-low bit-rate 
coding, as low as 32 kbps for stereo. 

TwinVQ Objects 

TwinVQ objects are based on fixed rate 
vector quantization instead of the Huffman 
coding used in AAC. They operate at lower bit- 
rates than AAC, supporting mono and stereo 
sound. There is also an Error Resilient version, 
ER TwinVQ. 

Speech Object Types 

Speech coding can be done using bit-rates 
from 2-24 kbps. Lower bit-rates, such as an 
average of 1.2 kbps, are possible when variable 
rate coding is used. The following audio 
objects are available: 

CELP Objects 

CELP (Code Excited Linear Prediction) 
objects support 8 and 16 kHz sampling rates at 
bit-rates of 4-24 kbps. There is also an Error 
Resilient version, ER CELP. 

HVXC Objects 

HVXC (Harmonic Vector eXcitation
Coding) objects support 8 kHz mono speech at
fixed bit-rates of 2-4 kbps (below 2 kbps using
a variable bit-rate mode), along with the ability
to change the pitch and speed during
decoding. There is also an Error Resilient version,
ER HVXC.



Synthesized Speech Object Types 

Scalable TTS (Text-to-Speech) objects
offer a low bit-rate (200 bps-1.2 kbps) phonemic
representation of speech. Content with
narration can be easily created without recording
natural speech. The TTS Interface allows
speech information to be transmitted in the
International Phonetic Alphabet (IPA) or in a
textual (written) form of any language. The
synthesized speech can also be synchronized
with a facial animation object.

Synthesized Audio Object Types 

Synthetic Audio support is provided by a 
Structured Audio Decoder implementation 
that allows the application of score-based con- 
trol information to musical instruments 
described in a special language. The following 
audio objects are available: 

Main Synthetic Objects 

Main Synthetic objects allow the use of
all of the MPEG-4 Structured Audio tools. They
support synthesis using the Structured Audio
Orchestra Language (SAOL) music-synthesis
language and wavetable synthesis using the
Structured Audio Sample-Bank Format (SASBF).

Wavetable Synthesis Objects 

Wavetable Synthesis objects are a subset 
of Main Synthetic, making use of SASBF and 
MIDI (Musical Instrument Digital Interface) 
tools. They provide relatively simple sampling 
synthesis. 

General MIDI Objects 

General MIDI objects provide
interoperability with existing content.

Visual Overview

MPEG-4 visual is divided into two sections. 
MPEG-4.2 includes the original MPEG-4 video 
codecs discussed in this section. MPEG-4.10 
specifies the “advanced video codec,” also 
known as H.264, and is discussed at the end of 
this chapter. 

The visual specifications are optimized for 
three primary bit-rate ranges: 

- less than 64 kbps 

- 64-384 kbps 

- 0.384-4 Mbps 

For high-quality applications, higher bit- 
rates are possible, using the same tools and bit- 
stream syntax as those used for lower bit-rates. 

With MPEG-4, visual objects (video 
codecs) use specific combinations of tools to 
efficiently represent different types of visual 
objects. Profiles use specific combinations of 
visual object types to efficiently service a spe- 
cific market segment. Levels specify size, rate, 
and complexity limitations within a profile to 
ensure interoperability. 

Currently, most solutions support only a 
couple of the MPEG-4.2 video codecs (usually 
Simple and Advanced Simple) due to silicon 
cost issues. Interest in MPEG-4.2 video codecs 
also dropped dramatically with the introduc- 
tion of the MPEG-4.10 (H.264) and SMPTE 
421M (VC-1) video codecs, which offer about 
2x better performance. 

YCbCr Color Space 

The 4:2:0 YCbCr color space is used for 
most objects. Each component can be repre- 
sented by a number of bits ranging from 4 to 
12 bits, with 8 bits being the most commonly 
used. 



MPEG-4.2 Simple Studio and Core Studio
objects may use 4:2:2, 4:4:4, 4:2:2:4, and
4:4:4:4 YCbCr or RGB sampling options, to
support the higher picture quality required
during the editing process.

Like H.263 and MPEG-2, the MPEG-4.2 
video codecs are also macroblock, block, and 
DCT-based. 

Visual Objects 

Instead of the video frames or pictures 
used in earlier MPEG specifications, MPEG-4 
uses natural and synthetic visual objects. 
Instances of video objects at a given time are 
called visual object planes (VOPs). 

Much like MPEG-2, there are I (intra), P 
(predicted), and B (bi-directional) VOPs. The 
S-VOP is a VOP for a sprite object. The 
S(GMC)-VOP is coded using prediction based 
on global motion compensation from a past ref- 
erence VOP. 

Arbitrarily shaped video objects, as well as 
rectangular objects, may be used. An MPEG-2 
video stream can be a rectangular video object, 
for example. 

Objects may also be scalable, enabling the 
reconstruction of useful video from pieces of a 
total bitstream. This is done by using a base 
layer and one or more enhancement layers. 

Only natural visual object types are dis- 
cussed since they are currently of the most 
interest in the marketplace. 

MPEG-4.2 Natural Visual Object Types 

MPEG-4.2 supports many natural visual
object types (video codecs), with several
interesting ones shown in Table 14.1. The more
common object types are:

Tools                     Object Type
                          Main     Core     Simple   Advanced  Advanced   Advanced    Fine
                                                     Simple    Real Time  Coding      Granularity
                                                               Simple     Efficiency  Scalable

VOP types                 I, P, B  I, P, B  I, P     I, P, B   I, P       I, P, B     I, P, B
chroma format             4:2:0    4:2:0    4:2:0    4:2:0     4:2:0      4:2:0       4:2:0
interlace                 X        -        -        X         -          X           X
global motion
compensation (GMC)        -        -        -        X         -          X           -
quarter-pel motion
compensation (QPEL)       -        -        -        X         -          X           -
slice resynchronization   X        X        X        X         X          X           X
data partitioning         X        X        X        X         X          X           X
reversible VLC            X        X        X        X         X          X           X
short header              X        X        X        X         X          X           X
method 1 and 2
quantization              X        X        -        X         -          X           X
shape adaptive DCT        -        -        -        -         -          X           -
dynamic resolution
conversion                -        -        -        -         X          X           -
NEWPRED                   -        -        -        -         X          X           -
binary shape              X        X        -        -         -          X           -
grey shape                X        -        -        -         -          X           -
sprite                    X        -        -        -         -          -           -
fine granularity
scalability (FGS)         -        -        -        -         -          -           X
FGS temporal
scalability               -        -        -        -         -          -           X

Table 14.1. Available Tools for Common MPEG-4.2 Natural Visual Object Types.

Main Objects

Main objects provide the highest video 
quality. Compared to Core objects, they also 
support grayscale shapes, sprites, and both 
interlaced and progressive content. 

Core Objects 

Core objects use a subset of the tools used
by Main objects, although B-VOPs are still
supported. They also support scalability by
sending extra P-VOPs. Binary shapes can include a
constant transparency but cannot do the
variable transparency offered by grayscale shape
coding.

Simple Objects 

Simple objects are low bit-rate, error resil- 
ient, rectangular natural video objects of arbi- 
trary aspect ratio. Simple objects use a subset 
of the tools used by Core objects. 

Advanced Simple Objects 

Advanced Simple objects look much like
Simple objects in that only rectangular objects
are supported, but add a few tools to improve
efficiency: B-frames, 1/4-pixel motion
compensation (QPEL), and global motion
compensation (GMC).

Fine Granularity Scalable Objects 

Fine Granularity Scalable objects can use 
up to eight scalable layers so delivery quality 
can easily adapt to transmission and decoding 
circumstances. 



MPEG-4.2 Natural Visual Profiles 

MPEG-4.2 supports many visual profiles 
and levels. Only natural visual profiles (Tables
14.2 and 14.3) are discussed since they are
currently of the most interest in the marketplace.
The more common profiles are: 

Main Profile 

Main profile was created for broadcast 
applications, supporting both progressive and 
interlaced content. It combines highest quality 
video with arbitrarily shaped objects. 

Core Profile 

Core profile is useful for higher quality 
interactive services, combining good quality 
with limited complexity and supporting arbi- 
trary shape objects. Mobile broadcast services 
can also be supported by this profile. 

Simple Profile 

Simple profile was created with low com- 
plexity applications in mind. Primary applica- 
tions are mobile and the Internet. 

Advanced Simple Profile 

Advanced Simple profile provides the abil- 
ity to distribute single-layer frame-based video 
at a wide range of bit-rates. 

Fine Granularity Scalable Profile 

Fine Granularity Scalable profile was created
with Internet streaming and wireless
multimedia in mind.

MPEG-4.2 Profile            Supported    Notes                                Level  Typical     Maximum  Maximum
                            Shapes                                                   Resolution  Number   Bit-Rate
                                                                                                 of
                                                                                                 Objects
Main                        arbitrary    additional tools and functionality   L4     BT.709      32       38.4 Mbps
                                                                              L3     BT.601      32       15 Mbps
                                                                              L2     CIF         16       2 Mbps
Core                        arbitrary    additional tools and functionality   L2     CIF         16       2 Mbps
                                                                              L1     QCIF        4        384 kbps
Advanced Core               arbitrary    higher coding efficiency             L2     CIF         16       2 Mbps
                                                                              L1     QCIF        4        384 kbps
N-Bit                       arbitrary                                         L2     CIF         16       2 Mbps
Simple                      rectangular                                       L3     CIF         4        384 kbps
                                                                              L2     CIF         4        128 kbps
                                                                              L1     QCIF        4        64 kbps
Advanced Simple             rectangular  higher coding efficiency             L5     BT.601      4        8 Mbps
                                                                              L4     352 x 576   4        3 Mbps
                                                                              L3b    CIF         4        1.5 Mbps
                                                                              L3     CIF         4        768 kbps
                                                                              L2     CIF         4        384 kbps
                                                                              L1     QCIF        4        128 kbps
                                                                              L0     QCIF        1        128 kbps
Advanced Real Time Simple   rectangular  higher error resilience              L4     CIF         16       2 Mbps
                                                                              L3     CIF         4        384 kbps
                                                                              L2     CIF         4        128 kbps
                                                                              L1     QCIF        4        64 kbps
Core Scalable               arbitrary    spatial and temporal scalability     L3     BT.601      16       4 Mbps
                                                                              L2     CIF         8        1.5 Mbps
                                                                              L1     CIF         4        768 kbps
Simple Scalable             rectangular  spatial and temporal scalability     L2     CIF         4        256 kbps
                                                                              L1     CIF         4        128 kbps
                                                                              L0     QCIF        1        128 kbps
Fine Granularity Scalable   rectangular  SNR and temporal scalability         L5     BT.601      4        8 Mbps
                                                                              L4     352 x 576   4        3 Mbps
                                                                              L3     CIF         4        768 kbps
                                                                              L2     CIF         4        384 kbps
                                                                              L1     QCIF        4        128 kbps
                                                                              L0     QCIF        1        128 kbps
Advanced Coding Efficiency  arbitrary    higher coding efficiency             L4     BT.709      32       38.4 Mbps
                                                                              L3     BT.601      32       15 Mbps
                                                                              L2     CIF         16       2 Mbps
                                                                              L1     CIF         4        384 kbps

Table 14.2a. MPEG-4.2 Natural Vision Profiles and Levels.

MPEG-4.2 Profile            Supported    Notes                                Level  Typical                    Maximum  Maximum
                            Shapes                                                   Resolution                 Number   Bit-Rate
                                                                                                                of
                                                                                                                Objects
Core Studio                 arbitrary    additional tools and functionality   L4     BT.709, 60P, 4:4:4         16       900 Mbps
(uses 10-bit pixel data)                                                             BT.709, 30I, 4:4:4:4:4:4
                                                                              L3     BT.709, 30I, 4:4:4         8        450 Mbps
                                                                                     BT.601, 4:2:2:4
                                                                              L2     BT.709, 30I, 4:2:2         4        300 Mbps
                                                                                     BT.601, 4:4:4:4:4:4
                                                                              L1     BT.601, 4:2:2:4            4        90 Mbps
                                                                                     BT.601, 4:4:4
Simple Studio               arbitrary                                         L4     BT.709, 60P, 4:4:4         1        1800 Mbps
(uses 10- or 12-bit                                                                  BT.709, 30I, 4:4:4:4:4:4
pixel data)                                                                   L3     BT.709, 30I, 4:4:4         1        900 Mbps
                                                                                     BT.709, 30I, 4:2:2:4
                                                                              L2     BT.709, 30I, 4:2:2         1        600 Mbps
                                                                                     BT.601, 4:4:4:4:4:4
                                                                              L1     BT.601, 4:2:2:4            1        180 Mbps
                                                                                     BT.601, 4:4:4

Table 14.2b. MPEG-4.2 Natural Vision Profiles and Levels. 4:4:4:4:4:4 means 
4:4:4 RGB + 3 auxiliary channels. 4:2:2:4 means 4:2:2 YCbCr + 1 auxiliary 
channel. 
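
As a small illustration of how profile levels are used, the sketch below picks the lowest Advanced Simple profile level whose maximum bit-rate covers a target rate. The bit-rate limits are copied from Table 14.2a; the function and variable names are illustrative only, and a real conformance check would also consider resolution and the number of objects.

```python
# Maximum bit-rates (kbps) for the Advanced Simple profile levels, taken
# from Table 14.2a and ordered from the least to the most demanding level.
ASP_LEVELS = [
    ("L0", 128),
    ("L1", 128),
    ("L2", 384),
    ("L3", 768),
    ("L3b", 1500),
    ("L4", 3000),
    ("L5", 8000),
]


def lowest_asp_level(target_kbps):
    """Return the first level whose bit-rate limit covers target_kbps, or None."""
    for name, max_kbps in ASP_LEVELS:
        if target_kbps <= max_kbps:
            return name
    return None


print(lowest_asp_level(700))   # L3
print(lowest_asp_level(9000))  # None: above every level in the table
```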



Profile columns: Main, Core, Simple, AS (Advanced Simple), ARTS (Advanced Real
Time Simple), ACE (Advanced Coding Efficiency), FGS (Fine Granularity Scalable).

                              MPEG-4.2 Profile
MPEG-4.2 Object Type          Main   Core   Simple   AS     ARTS   ACE    FGS
Main                          X      -      -        -      -      -      -
Core                          X      X      -        -      -      X      -
N-Bit                         -      -      -        -      -      -      -
Simple                        X      X      X        X      X      X      X
Advanced Simple               -      -      -        X      -      -      X
Advanced Real Time Simple     -      -      -        -      X      -      -
Advanced Coding Efficiency    -      -      -        -      -      X      -
Fine Granularity Scalable     -      -      -        -      -      -      X

Table 14.3. Objects Supported by Common MPEG-4.2 Profiles.

                              Graphics Profile
Graphics Element of           Simple   Complete   Complete
BIFS Tool (BIFS node)         2D       2D
appearance                    X        X          X
background                    -        -          X
background 2D                 -        X          X
bitmap                        X        X          X
box                           -        -          X
circle                        -        X          X
color                         -        X          X
cone                          -        -          X
coordinate                    -        -          X
coordinate 2D                 -        X          X
curve 2D                      -        X          X
cylinder                      -        -          X
directional light             -        -          X
elevation grid                -        -          X
expression                    -        -          X
extrusion                     -        -          X
face                          -        -          X
face def mesh                 -        -          X
face def table                -        -          X
face def transform            -        -          X
FAP                           -        -          X
FDP                           -        -          X
FIT                           -        -          X
fog                           -        -          X
font style                    -        X          X
indexed face set              -        -          X
indexed face set 2D           -        X          X
indexed line set              -        -          X
indexed line set 2D           -        X          X
line properties               -        X          X
material                      -        -          X
material 2D                   -        X          X
normal                        -        -          X
pixel texture                 -        X          X
point light                   -        -          X
point set                     -        -          X
point set 2D                  -        X          X
rectangle                     -        X          X
shape                         X        X          X
sphere                        -        -          X
spot light                    -        -          X
text                          -        X          X
texture coordinate            -        X          X
texture transform             -        X          X
viseme                        -        -          X

Table 14.4. Graphics Elements (BIFS Tools) Supported by MPEG-4 Graphics Profiles.

Graphics Overview

Graphics profiles specify which graphics 
elements of the BIFS tool can be used to build 
a scene. Although it is defined in the Systems 
specification, graphics is really just another 
media profile like audio and video, so it is dis- 
cussed here. 

Four hierarchical graphics profiles are
defined: Simple 2D, Complete 2D, Complete,
and 3D Audio Graphics. They differ in the
graphics elements of the BIFS tool to be
supported by the decoder, as shown in Table 14.4.

Simple 2D profile provides the basic fea- 
tures needed to place one or more visual 
objects in a scene. 

Complete 2D profile provides 2D graphics 
functions and supports features such as arbi- 
trary 2D graphics and text, possibly in conjunc- 
tion with visual objects. 

Complete profile provides advanced capa- 
bilities such as elevation grids, extrusions, and 
sophisticated lighting. It enables complex vir- 
tual worlds to exhibit a high degree of realism. 

3D Audio Graphics profile may be used to 
define the acoustical properties of the scene 
(geometry, acoustics absorption, diffusion, 
material transparency). This profile is useful 
for applications that do environmental equal- 
ization of the audio signals. 



Visual Layers 

An MPEG-4 visual scene consists of one or 
more video objects. Currently, the most com- 
mon video object is a simple rectangular frame 
of video. 



Each video object may have one or more 
layers to support temporal or spatial scalable 
coding. This enables the reconstruction of 
video in a layered manner, starting with a base 
layer and adding a number of enhancement 
layers. Where a high degree of scalability is 
needed, such as when an image is mapped 
onto a 2D or 3D object, a wavelet transform is 
available. 

The visual bitstream provides a hierarchi- 
cal description of the scene. Each level of hier- 
archy can be accessed through the use of 
unique start codes in the bitstream. 

Visual Object Sequence (VS) 

This is the complete scene which contains 
all the 2D or 3D, natural or synthetic, objects 
and any enhancement layers. 

Video Object (VO) 

A video object corresponds to a particular 
object in the scene. In the most simple case 
this can be a rectangular frame, or it can be an 
arbitrarily shaped object corresponding to an 
object or background of the scene. 

Video Object Layer (VOL) 

Each video object can be encoded in scal- 
able (multi-layer) or nonscalable form (single 
layer), depending on the application, repre- 
sented by the video object layer (VOL). The 
VOL provides support for scalable coding. A 
video object can be encoded using spatial or 
temporal scalability, going from coarse to fine 
resolution. Depending on parameters such as 
available bandwidth, computational power, and 
user preferences, the desired resolution can be 
made available to the decoder.

[Figure: hierarchy of the video bitstream, from top to bottom: visual object
sequence (VS), video object (VO), video object layer (VOL), group of VOPs
(GOV), video object plane (VOP). The diagram also labels a "Layer 2".]

Figure 14.1. Example MPEG-4 Video Bitstream Logical Structure.

There are two types of video object layers,
the video object layer that provides full MPEG- 
4 functionality, and a reduced functionality 
video object layer, the video object layer with 
short headers. The latter provides bitstream 
compatibility with baseline H.263. 

Group of Video Object Planes (GOV)

Each video object is sampled in time; each 
time sample of a video object is a video object 
plane. Video object planes can be grouped 
together to form a group of video object 
planes. 

The GOV groups together video object 
planes. GOVs can provide points in the bit- 
stream where video object planes are encoded 
independently from each other, and can thus 
provide random access points into the bit- 
stream. GOVs are optional. 

Video Object Plane (VOP) 

A VOP is a time sample of a video object. 
VOPs can be encoded independently of each 
other, or dependent on each other by using 
motion compensation. A conventional video 
frame can be represented by a VOP with rect- 
angular shape. 
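
To make the role of start codes concrete, the sketch below scans a byte string for the 0x000001 prefix and labels the layer that each start code introduces. The start code values are not given in this chapter; they are quoted from the MPEG-4 Part 2 syntax as the editor understands it and should be checked against the specification. The function names and the sample bytes are illustrative only.

```python
def classify_start_code(code):
    """Map the byte that follows a 0x000001 prefix to a layer of the hierarchy."""
    if code == 0xB0:
        return "VS"    # visual object sequence start
    if 0x00 <= code <= 0x1F:
        return "VO"    # video object start
    if 0x20 <= code <= 0x2F:
        return "VOL"   # video object layer start
    if code == 0xB3:
        return "GOV"   # group of VOPs start
    if code == 0xB6:
        return "VOP"   # video object plane start
    return None        # other start codes are ignored in this sketch


def find_start_codes(data):
    """Return a list of (offset, label) for each recognized start code."""
    found = []
    pos = data.find(b"\x00\x00\x01")
    while pos != -1 and pos + 3 < len(data):
        label = classify_start_code(data[pos + 3])
        if label is not None:
            found.append((pos, label))
        pos = data.find(b"\x00\x00\x01", pos + 3)
    return found


# Made-up sample: start codes separated by a few filler bytes.
sample = bytes.fromhex("000001b0 f5 00000100 00000120 ff 000001b6 12")
print(find_start_codes(sample))
```

A real parser would go on to decode the header fields that follow each start code; this sketch only locates the boundaries between levels of the hierarchy.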

Object Description 
Framework 

Unlike MPEG-2, MPEG-4 does not multi- 
plex multiple elementary streams together into 
a single transport or program stream. 



Data for each object (audio, one layer of
one visual object, etc.), scene description
information (to declare the spatial-temporal
relationship of objects), and object control
information are carried in separate elementary 
streams. Synthetic objects may be generated 
using BIFS to provide the graphics and audio. 
BIFS is more than a scene description lan- 
guage — it integrates natural and synthetic 
objects into the same composition space. 

The object description framework is a set of 
object descriptors used to identify, describe, and 
associate elementary streams to each other, 
and to objects used in the scene description, as 
illustrated in Figure 14.2. 

An initial object descriptor, a derivative of 
the object descriptor, contains two descriptors. 
One descriptor points to the scene description 
[elementary] stream; the other points to the
corresponding object descriptor [elementary] 
stream. 

Object Descriptor (OD) Stream 

The object descriptors are transported in a 
dedicated elementary stream, called the object 
descriptor stream. 

The object descriptor effectively associates 
sets of related elementary streams so they are 
seen as a single entity by the decoder. Each 
object descriptor contains other descriptors 
that typically point to one or more elementary 
streams associated to a single node and a sin- 
gle audio or visual object. This allows support 
for multiple alternative streams, such as differ- 
ent languages. 

In addition, an object descriptor can point
to auxiliary data such as object content
information (OCI) and intellectual property rights
management and protection (IPMP).

[Figure: nodes in the scene description (audio source, movie texture, inline)
each reference an object descriptor. The object descriptors contain ES
descriptors and, for the movie texture, OCI and IPMP descriptors. These point
to the elementary streams: audio 1 (English), audio 2 (Japanese), visual, OCI
for the visual stream, IPMP for the visual stream, OCI for the scene, the
object descriptor stream, and the scene description stream.]

Figure 14.2. Linking Elementary Streams to a Scene.

Object descriptors are not simply present
in an object descriptor stream one after the 
other. Rather, they are encapsulated in object 
descriptor commands. These commands enable 
object descriptors to be dynamically conveyed, 
updated, or removed at a specific point in time. 
This allows new elementary streams for an 
object to be advertised as they become avail- 
able, or to remove references to elementary 
streams that are no longer available. Updates 
are time stamped to indicate when they are to 
take effect. The time stamp is placed on the 
sync layer, as with any other elementary 
stream. 
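
The effect of time-stamped object descriptor commands can be sketched as follows. This is a simplified model, not the bitstream syntax: the command tuples, the "update" and "remove" action names, and the numeric identifiers are all hypothetical.

```python
def apply_od_commands(commands, now):
    """Return the object descriptor table in effect at time `now`.

    Each command is (timestamp, action, od_id, es_ids), where action is
    "update" (add or replace a descriptor) or "remove". Commands whose
    time stamp is later than `now` have not taken effect yet.
    """
    table = {}
    for timestamp, action, od_id, es_ids in sorted(commands, key=lambda c: c[0]):
        if timestamp > now:
            break
        if action == "update":
            table[od_id] = list(es_ids)
        elif action == "remove":
            table.pop(od_id, None)
    return table


# Hypothetical command sequence for object descriptor 1.
commands = [
    (0, "update", 1, [101]),        # one stream is advertised at start
    (5, "update", 1, [101, 102]),   # a second stream becomes available
    (9, "remove", 1, []),           # the object is withdrawn
]
print(apply_od_commands(commands, 6))   # {1: [101, 102]}
print(apply_od_commands(commands, 10))  # {}
```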

Object Content Information (OCI) 

The OCI elementary stream conveys OCI 
events. Each OCI event consists of OCI descrip- 
tors. 

OCI descriptors communicate a number of 
features of the associated object, such as key- 
words, text description of the content, lan- 
guage, parental rating, creation date, authors, 
etc. 

If the OCI information will never change, it 
may instead be conveyed using OCI descriptors
within the object descriptor stream.

Intellectual Property Management and 
Protection (IPMP) 

The IPMP elementary stream conveys 
IPMP messages to one or more IPMP sys- 
tems. The IPMP system provides intellectual 
property management and content protection 
functions in the receiver. 

If the IPMP information will rarely change, 
it may instead be conveyed using IPMP 
descriptors within the object descriptor 
stream. 



Scene Description 

To assemble a multimedia scene at the 
receiver, it is not sufficient to simply send just 
the multiple streams of data. For example, 
objects may be located in 2D or 3D space, and 
each has its local coordinate system. Objects 
are positioned within a scene by transforming 
each of them to the scene’s coordinate system. 
Therefore, additional data is required for the 
receiver to assemble a meaningful scene for 
the user. This additional data is called scene 
description. 

Scene graph elements (which are BIFS 
tools) describe audiovisual primitives and 
attributes. These elements, and any relation- 
ship between them, form a hierarchical scene 
graph, as illustrated in Figure 14.3. The scene 
graph is not necessarily static; elements may 
be added, deleted, or modified as needed. 
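
The idea of placing objects by transforming local coordinates into the scene's coordinate system can be sketched with a toy scene graph. This is only an illustration: it uses 2D translations alone, and the class, function, and node names are hypothetical rather than BIFS node types.

```python
class Node:
    """A scene graph element with a 2D offset relative to its parent."""

    def __init__(self, name, dx=0.0, dy=0.0):
        self.name = name
        self.dx = dx
        self.dy = dy
        self.children = []

    def add(self, child):
        """Attach a child element and return it, so calls can be chained."""
        self.children.append(child)
        return child


def scene_positions(node, origin=(0.0, 0.0)):
    """Return {name: (x, y)} with every node placed in scene coordinates."""
    x, y = origin[0] + node.dx, origin[1] + node.dy
    positions = {node.name: (x, y)}
    for child in node.children:
        positions.update(scene_positions(child, (x, y)))
    return positions


scene = Node("scene")
presenter = scene.add(Node("presenter", 100, 50))
presenter.add(Node("logo", 10, -5))
print(scene_positions(scene)["logo"])  # (110.0, 45.0)
```

Because the logo is a child of the presenter, moving the presenter moves the logo with it, which is the practical benefit of a hierarchical graph.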

The scene graph profile defines the allow- 
able set of scene graph elements that may be 
used. 

BIFS 

BIFS (Binary Format for Scenes) is used
to describe not only the scene composition
information but also graphical elements. A
fundamental difference between BIFS and
VRML is that BIFS is a binary format, whereas
VRML is a textual format. BIFS supports the
elements used by VRML and several that
VRML does not, including compressed binary
format, streaming, streamed animation, 2D
primitives, enhanced audio, and facial
animation.

Compressed Binary Format

BIFS supports an efficient binary repre- 
sentation of the scene graph information. The 
coding may be either lossless or lossy. Lossy 
compression is possible due to context knowl- 
edge: if some scene graph data has been 
received, it is possible to anticipate the type 
and format of subsequent data. 

Streaming 

BIFS is designed so that a scene may be 
transmitted as an initial scene, followed by 
modifications to the scene. 



Streamed Animation 

BIFS includes a low-overhead method for 
the continuous animation of changes to numer- 
ical values of the elements in a scene. This pro- 
vides an alternative to the interpolator 
elements supported in BIFS and VRML. 

2D Primitives 

BIFS has native support for 2D scenes to 
support low-complexity, low-cost solutions 
such as traditional television. Rather than parti- 
tioning the world into 2D vs. 3D, BIFS allows 
both 2D and 3D elements in a single scene. 




Figure 14.3. Example MPEG-4 Hierarchical Scene Graph.

Enhanced Audio

BIFS improves audio support through the
use of an audio scene graph, enabling audio
sources to be mixed or sound effects to be
generated.

Facial Animation 

BIFS exposes the animated face properties 
to the scene level. This enables it to be a full 
member of a scene that can be integrated with 
any other BIFS functionality, similar to other 
audiovisual objects. 



Synchronization of 
Elementary Streams 

Sync Layer 

The sync layer (Figure 14.4) partitions each 
elementary stream into a sequence of access 
units, the smallest entity to which timing infor- 
mation can be associated. It then encodes all 
the relevant properties using a flexible syntax, 
generating SL packets. These SL packets are 
then passed on to a delivery (transport) layer. 
A sequence of SL packets from a single ele- 
mentary stream is called an SL-packetized 
stream. 



[Figure: three stacked layers. The compression layer is media aware and
delivery unaware; the sync layer is media unaware and delivery unaware; the
delivery layer is media unaware and delivery aware. The elementary stream
interface (ESI) lies between the compression and sync layers, and the DMIF
application interface (DAI) lies between the sync and delivery layers.]

Figure 14.4. Relationship Between the MPEG-4 Compression, Sync, and Delivery Layers.

Unlike the MPEG-2 PES, the sync layer is
not a self-contained stream. Instead, it is an 
intermediate format that is mapped to a spe- 
cific delivery layer, such as IP, MPEG-2 trans- 
port stream, etc. For this reason, there is no 
need to include start codes, SL packet lengths, 
etc., in the sync layer — these are already 
included in the various delivery layer proto- 
cols. 

SL packets serve two purposes. First, 
access units can be fragmented in any way dur- 
ing adaptation to a specific delivery layer. Sec- 
ond, it makes sense to have the encoder guide 
the fragmentation when it knows about deliv- 
ery layer characteristics, such as the size of 
the maximum transfer unit (MTU).
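
The fragmentation of an access unit can be sketched as below. This is a simplified model rather than the SL packet syntax: each packet is represented as a dictionary, and the field names and the `max_payload` parameter are assumptions chosen for illustration.

```python
def fragment_access_unit(access_unit, max_payload):
    """Split one access unit into SL-packet-like dicts of bounded payload size.

    The first fragment is flagged as the start of the access unit and the
    last as its end, so that a receiver can reassemble the unit.
    """
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    chunks = [access_unit[i:i + max_payload]
              for i in range(0, len(access_unit), max_payload)] or [b""]
    return [
        {"au_start": i == 0, "au_end": i == len(chunks) - 1, "payload": chunk}
        for i, chunk in enumerate(chunks)
    ]


# A 10-byte access unit carried over a link that allows 4 payload bytes per packet.
packets = fragment_access_unit(bytes(range(10)), 4)
for packet in packets:
    print(packet["au_start"], packet["au_end"], len(packet["payload"]))
```

An encoder that knows the MTU can choose `max_payload` so that each packet fits one delivery-layer unit.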

Synchronization of multiple elementary 
streams is done by conveying object clock ref- 
erence (OCR), decoding time stamps (DTS), 
composition time stamps (CTS), and clock
references within the sync layer.

The sync layer syntax is flexible in that it 
can be configured individually for each ele- 
mentary stream. For example, low bit-rate 
audio streams may desire time stamps with 
minimum overhead; high bit-rate video 
streams may need very precise time stamps. 
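
The trade-off between time stamp overhead and precision can be shown with a short calculation. It assumes that each stream is configured with a time stamp resolution (ticks per second) and a time stamp length (bits); the parameter names and the example values are illustrative, not taken from the text.

```python
def timestamp_wrap_seconds(resolution_hz, length_bits):
    """Seconds until a time stamp counter of the given width wraps around."""
    return (2 ** length_bits) / resolution_hz


# Hypothetical low bit-rate audio stream: coarse resolution, short time stamps.
print(timestamp_wrap_seconds(1000, 16))    # 65.536 seconds

# Hypothetical video stream: 90 kHz resolution with 33-bit time stamps.
print(timestamp_wrap_seconds(90000, 33))   # roughly 26.5 hours
```

Shorter time stamps cost fewer bits per SL packet but wrap sooner and, at a coarser resolution, resolve time less finely.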

DMIF Application Interface

Unlike MPEG-2, MPEG-4 supports multi- 
ple simultaneous usage scenarios (local 
retrieval, remote interaction, broadcast,
multicast, etc.), and multiple simultaneous delivery
technologies.

The DMIF (Delivery Multimedia Integra- 
tion Framework) Application Interface, or DAI, 
is the interface that controls the data 
exchanged between a sync layer and a delivery 
layer (Figure 14.4) during both transmission 
and reception. It enables accessing, presenting,
and synchronizing MPEG-4 content transmitted
or received using different
technologies, such as MPEG-2 transport
stream, IP multicast, RTP, etc., even
simultaneously.

As a result, a specification for control and 
data mapping to a specific delivery or storage 
protocol (also called a payload format specifica- 
tion) has to be done jointly with the organiza- 
tion that manages the delivery layer 
specification. In the example of MPEG-4 trans- 
port over IP, development work was done 
jointly with the Internet Engineering Task 
Force (IETF). 

Multiplexing of Elementary 
Streams 

Delivery of MPEG-4 content is a task that 
is usually dealt with outside the specification. 

However, an analysis of existing delivery 
layers indicated a need for an additional layer 
of multiplexing. The occasionally bursty and 
low bit-rate MPEG-4 streams sometimes have 
to map to a delivery layer that uses fixed 
packet sizes (such as MPEG-2 transport 
streams) or high multiplexing overhead (such 
as RTP/UDP/IP). The potentially large number
of delivery streams may also impose a burden
in terms of management and cost.

To address this situation, FlexMux, a very 
simple multiplex packet syntax, was defined. It 
allows multiplexing a number of SL-packetized 
streams into a self-contained FlexMux stream 
with low overhead. 

In addition, the specifications for the
encapsulation of SL-packetized streams into
common delivery layer protocols, including
MPEG-2 transport and program streams, IP
(see Chapter 19), and the MPEG-4 file format,
have already been completed.

FlexMux

FlexMux multiplexes one or more SL- 
packetized streams with varying instantaneous 
bit-rates into FlexMux packets that have vari- 
able lengths. 

Identification of SL packets originating 
from different elementary streams is through 
FlexMux channel numbers. Each SL-pack- 
etized stream is mapped into one FlexMux 
channel. FlexMux packets with data from dif- 
ferent SL-packetized streams can therefore be 
arbitrarily interleaved. The sequence of 
FlexMux packets that are interleaved into one 
stream is called a FlexMux stream. 
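
The interleaving of channels can be sketched with a simplified packet layout: a one-byte channel number, a one-byte payload length, then the payload. This layout is an assumption made for illustration and covers only the simplest case; the specification should be consulted for the actual FlexMux syntax. The function names are hypothetical.

```python
def flexmux_pack(channel, payload):
    """Build one simplified FlexMux packet: channel byte, length byte, payload."""
    if not 0 <= channel <= 255 or len(payload) > 255:
        raise ValueError("channel and payload length must each fit in one byte")
    return bytes([channel, len(payload)]) + payload


def flexmux_unpack(stream):
    """Split a simplified FlexMux stream into per-channel lists of payloads."""
    channels = {}
    pos = 0
    while pos + 2 <= len(stream):
        channel, length = stream[pos], stream[pos + 1]
        payload = stream[pos + 2:pos + 2 + length]
        channels.setdefault(channel, []).append(payload)
        pos += 2 + length
    return channels


# Packets from two SL-packetized streams, arbitrarily interleaved.
stream = (flexmux_pack(1, b"aud0")
          + flexmux_pack(2, b"video-0")
          + flexmux_pack(1, b"aud1"))
print(flexmux_unpack(stream))
```

The two header bytes per packet show why the overhead stays low even when many small packets are multiplexed together.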

MPEG-4 Over MPEG-2 

The MPEG-2 PES is the common denomi- 
nator for encapsulating content. MPEG-4 
defines the encapsulation of SL-packetized and 
FlexMux streams within PES packets. 

One SL-packetized stream is mapped to 
one PID or stream_id of the MPEG-2
multiplex. Each SL packet is mapped to one PES
packet. PES and SL packet header redundancy 
is reduced by conveying information only in 
the PES header, removing duplicate data from 
the SL packet header. 

An integer number of FlexMux packets 
can also be conveyed in a PES packet to fur- 
ther reduce multiplex overhead. Several SL- 
packetized streams can then be mapped to one 
MPEG-2 PID or stream_id. The PES header is
not used at all, since synchronization can be 
done with the time stamp information con- 
veyed in the SL packet headers. 



MP4 File Format 

A file format for the exchange of MPEG-4 
content has also been defined. The file format 
supports metadata in order to allow indexing, 
fast searches, and random access. 

Intellectual Property 
Management and Protection 
(IPMP) 

IPMP, also called digital rights manage- 
ment (DRM), provides an interface and tools, 
rather than a complete system, for implement- 
ing intellectual property rights management. 

The level and type of management and pro- 
tection provided is dependent on the value of 
the content and the business model. For this 
reason, the complete design of the IPMP sys- 
tem is left to application developers. 

The architecture enables both open and 
proprietary solutions to be used, while 
enabling interoperability, supporting the use of 
more than one type of protection (e.g.,
decryption, watermarking, rights management),
and supporting the transferring of content
between devices using a defined inter-device
message (reflecting the issue of content
distribution over home networks).

For protected content, the IPMP tool 
requirements are communicated to the 
decoder before the presentation starts. Tool 
configuration and initialization information is 
conveyed by the IPMP Descriptor or IPMP
elementary stream. Needed tools can be
embedded, downloaded, or acquired by other means.

Control point and ordering sequence
information in the IPMP Descriptor allows different
tools to function at different places in the sys- 
tem. IPMP data, carried in either an IPMP 
Descriptor or IPMP elementary stream, 
includes rights containers, key containers, and 
tool initialization data. 



MPEG-4.10 (H.264) Video 

Previously known as “H.26L,” “JVT,” “JVT 
codec,” “AVC,” and “Advanced Video codec,” 
ITU-T H.264 is one of two new video codecs, 
the other being SMPTE 421M (VC-1), which is 
based on Microsoft Windows Media Video 9 
codec. H.264 is incorporated into the MPEG-4 
specifications as Part 10. 

Rather than a single major advancement, 
H.264 employs many new tools designed to 
improve performance. These include: 

- Support for 8-, 10-, and 12-bit 4:2:2 and 4:4:4 
YCbCr 

- Integer transform 

- UVLC, CAVLC, and CABAC entropy coding 

- Multiple reference frames 

- Intra prediction 

- In-loop deblocking filter 

- SP and SI slices 

- Many new error resilience tools 



Profiles and Levels 

Similar to other video codecs, profiles 
specify the syntax (i.e., algorithms) and levels 
specify various parameters (resolution, frame 
rate, bit-rate, etc.). The various levels are 
described in Table 14.5. 
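
The way a level constrains resolution and frame rate can be checked with a short calculation. The limits below are copied from the first two numeric columns of Table 14.5; the function name is illustrative, and the check deliberately ignores the bit-rate, motion vector, and reference frame limits.

```python
import math

# (level, maximum macroblocks per second, maximum frame size in macroblocks),
# copied from Table 14.5.
H264_LEVELS = [
    ("1", 1485, 99), ("1.1", 3000, 396), ("1.2", 6000, 396),
    ("1.3", 11880, 396), ("2", 11880, 396), ("2.1", 19800, 792),
    ("2.2", 20250, 1620), ("3", 40500, 1620), ("3.1", 108000, 3600),
    ("3.2", 216000, 5120), ("4", 245760, 8192), ("4.1", 245760, 8192),
    ("4.2", 491520, 8192), ("5", 589824, 22080), ("5.1", 983040, 36864),
]


def lowest_level(width, height, fps):
    """Return the lowest level that fits the frame size and macroblock rate."""
    # A macroblock covers 16 x 16 samples, so partial blocks round up.
    frame_mbs = math.ceil(width / 16) * math.ceil(height / 16)
    mbs_per_second = frame_mbs * fps
    for level, max_mbs_per_second, max_frame_mbs in H264_LEVELS:
        if frame_mbs <= max_frame_mbs and mbs_per_second <= max_mbs_per_second:
            return level
    return None


print(lowest_level(1280, 720, 30))   # 3.1
print(lowest_level(1920, 1080, 30))  # 4
```

For example, 1280 x 720 is 80 x 45 = 3,600 macroblocks per frame, or 108,000 macroblocks per second at 30 frames per second, which matches the Level 3.1 row exactly.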

Baseline Profile (BP) 

Baseline profile is designed for progres- 
sive video such as video conferencing, video- 
over-IP, and mobile applications. Tools used by 
Baseline profile include: 

- I and P slice types

- 1/4-pixel motion compensation

- UVLC and CAVLC entropy coding 

- Arbitrary slice ordering (ASO) 

- Flexible macroblock ordering (FMO) 

- Redundant slices (RS) 

- 4:2:0 YCbCr format 

Note that Baseline profile is not a subset of 
Main profile. Many solutions implement a sub- 
set of Baseline profile, without ASO or FMO; 
this is a subset of Main profile (and much
easier to implement).

Extended Profile (XP) 

Extended profile is designed for mobile 
and Internet streaming applications. Additional 
tools over Baseline profile include: 



- B, SP, and SI slice types 

- Slice data partitioning 

- Weighted prediction

Level  Maximum     Maximum     Typical       Typical     Maximum MVs      Maximum    Maximum
       MB per      Frame Size  Frame         Frames per  per Two          Reference  Bit-Rate
       Second      (MB)        Resolution    Second      Consecutive MBs  Frames
1      1,485       99          176 x 144     15          -                4          64 kbps
1.1    3,000       396         176 x 144     30          -                9          192 kbps
                               320 x 240     10                           3
                               352 x 288     7.5                          3
1.2    6,000       396         352 x 288     15          -                6          384 kbps
1.3    11,880      396         352 x 288     30          -                6          768 kbps
2      11,880      396         352 x 288     30          -                6          2 Mbps
2.1    19,800      792         352 x 480     30          -                6          4 Mbps
                               352 x 576     25
2.2    20,250      1,620       720 x 480     15          -                5          4 Mbps
                               720 x 576     12.5
3      40,500      1,620       720 x 480     30          32               5          10 Mbps
                               720 x 576     25
3.1    108,000     3,600       1280 x 720    30          16               5          14 Mbps
3.2    216,000     5,120       1280 x 720    60          16               4          20 Mbps
4      245,760     8,192       1920 x 1080   30          16               4          20 Mbps
                               1280 x 720    60
4.1    245,760     8,192       1920 x 1080   30          16               4          50 Mbps
                               1280 x 720    60
4.2    491,520     8,192       1920 x 1080   60          16               4          50 Mbps
5      589,824     22,080      2048 x 1024   72          16               5          135 Mbps
5.1    983,040     36,864      2048 x 1024   120         16               5          240 Mbps
                               4096 x 2048   30

Table 14.5. MPEG-4.10 (H.264) Levels. “MB” = macroblock, “MV” = motion vector.

Main Profile (MP)

Main profile is designed for a wide range 
of broadcast applications. Additional tools over 
Baseline profile include: 

- Interlaced coding 

- B slice type 

- CABAC entropy coding 

- Weighted prediction 

- 4:2:2 and 4:4:4 YCbCr, 10- and 12-bit formats 

- ASO, FMO, and RS are not supported 

High Profiles (HP) 

After the initial specification was com- 
pleted, the Fidelity Range Extension (FRExt) 
amendment was added. This resulted in four 
additional profiles being added to the specifica- 
tion: 

- High Profile (HP): adds support for adaptive 
selection between 4x4 and 8x8 block sizes for 
the luma spatial transform and encoder-specified 
frequency-dependent scaling matrices for trans- 
form coefficients 

- High 10 Profile (Hi10P): adds support for 9-
or 10-bit 4:2:0 YCbCr 

- High 4:2:2 Profile (Hi422P): adds support 
for 4:2:2 YCbCr 

- High 4:4:4 Profile (Hi444P): adds support 
for 11- or 12-bit samples, 4:4:4 YCbCr or RGB, 
residual color transform and predictive lossless 
coding 



Supplemental Enhancement 
Information (SEI) Messages 

Supplemental enhancement information 
(SEI) messages assist in processes related to 
decoding, display, or other purposes. SEI mes- 
sages include: 

- Buffering period 

- Picture timing 

- Pan-scan rectangle 

- Filler payload 

- User data registered 

- User data unregistered 

- Recovery point 

- Decoded reference picture marking repetition 

- Spare picture 

- Scene information 

- Sub-sequence information 

- Sub-sequence layer characteristics 

- Sub-sequence characteristics 

- Full-frame freeze 

- Full-frame freeze release

- Full-frame snapshot 

- Progressive refinement segment start 

- Progressive refinement segment end 

- Motion-constrained slice group set

The Fidelity Range Extension (FRExt)
amendment added three new supplemental 
enhancement information (SEI) messages: 

- Film grain characteristics 

- Deblocking filter display preference 

- Stereo video 



Video Coding Layer 

YCbCr Color Space 

H.264 uses the YCbCr color space, sup- 
porting 4:2:0, 4:2:2, and 4:4:4 sampling. The 
4:2:2 and 4:4:4 sampling options increase the 
chroma resolution over 4:2:0, resulting in bet- 
ter picture quality. In addition to 8-bit YCbCr 
data, H.264 supports 10- and 12-bit YCbCr data 
to further improve picture quality. 



The 4:2:0 sampling structure for H.264 is 
shown in Figures 3.8 through 3.10. The 4:2:2 
and 4:4:4 sampling structures are shown in 
Figures 3.2 and 3.3. 

Macroblocks 

With H.264, the partitioning of the 16 x 16 
macroblocks has been extended, as illustrated 
in Figure 14.5. 

Such fine granularity leads to a potentially 
large number of motion vectors per macrob- 
lock (up to 32) and number of blocks that must 
be interpolated (up to 96). To constrain 
encoder/decoder complexity, there are limits
on the number of motion vectors used for two 
consecutive macroblocks. 

Error concealment is improved with Flexi- 
ble Macroblock Ordering (FMO), which 
assigns macroblocks to another slice so they 
are transmitted in a non-scanning sequence. 



16 x 16     16 x 8     8 x 16     8 x 8

8 x 8       8 x 4      4 x 8      4 x 4




Figure 14.5. Segmentations of H.264 Macroblocks for Motion Compensation. 
Top: segmentation of macroblocks. Bottom: segmentation of 8 x 8 partitions. 
This reduces the chance that an error will
affect a large spatial region, and improves 
error concealment by being able to use neigh- 
boring macroblocks for prediction of a missing 
macroblock. 

In-loop De-blocking Filter 

H.264 adds an in-loop deblocking filter. It 
removes artifacts resulting from adjacent mac- 
roblocks having different estimation types 
and/or different quantizer scales. The filter 
also removes artifacts resulting from adjacent 
blocks having different transform/quantiza- 
tion and motion vectors. 

The loop filter uses a content adaptive non- 
linear filter to modify the two samples on either 
side of a block or macroblock boundary. 

Slices 

The slice has greater importance in H.264 
since it is now the basic independent spatial 
element. This prevents an error in one slice 
from affecting other slices. This flexibility 
allows extending the I-, P-, and B-picture types 
down to the slice level, resulting in I-, P-, and B- 
slice types. 

Arbitrary Slice Ordering (ASO) enables 
slices to be transmitted and received out of 
order. This can improve low-delay perfor- 
mance in video conferencing and Internet 
applications. 

Redundant slices are also allowed for addi- 
tional error resilience. This alternative data 
can be used to recover any corrupted macrob- 
locks. 

SP- and SI-Slices

In addition to I-, P-, and B-slices, H.264
adds support for SP-slices (Switching P) and
SI-slices (Switching I). SP-slices use motion
compensated prediction, taking advantage of
temporal redundancy to enable reconstruction
of a slice even when different reference slices
are used. SI-slices take advantage of spatial
prediction to enable identically reconstructing
a corresponding SP-slice.

Use of SP- and SI-slices enables efficient
bitstream switching, random access, fast-forward,
and error resilience/recovery, as illustrated in
Figures 14.6 and 14.7.

Intra Prediction

When motion estimation is not efficient, 
intra prediction can be used to eliminate spatial 
redundancies. This technique attempts to pre- 
dict the current block based on adjacent 
blocks. The difference between the predicted 
block and the actual block is then coded. This 
tool is very useful in flat backgrounds where 
spatial redundancies often exist. 
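
As a rough illustration of predicting a block from its neighbors and coding only the difference, the sketch below uses a DC-style prediction for a 4 x 4 block: every predicted sample is the rounded mean of the four samples above and the four samples to the left. The function names and sample values are hypothetical, and this shows only one simple prediction mode, not the full set H.264 offers.

```python
def dc_predict_4x4(above, left):
    """Predict a 4x4 block as the rounded mean of its 8 neighboring samples."""
    dc = (sum(above) + sum(left) + 4) >> 3  # +4 rounds before dividing by 8
    return [[dc] * 4 for _ in range(4)]


def residual(block, predicted):
    """Difference between the actual block and its prediction (what gets coded)."""
    return [[b - p for b, p in zip(brow, prow)]
            for brow, prow in zip(block, predicted)]


# Hypothetical flat-background samples.
above = [100, 102, 101, 99]   # row of samples directly above the block
left = [98, 100, 101, 99]     # column of samples directly to the left
block = [[100, 101, 100, 99],
         [100, 100, 100, 100],
         [101, 100, 99, 100],
         [100, 100, 100, 101]]

predicted = dc_predict_4x4(above, left)
diff = residual(block, predicted)
```

In a flat region such as this one, the residual values stay close to zero, which is why the tool is effective on backgrounds.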

Motion Compensation 

¼-Pixel Motion Compensation

Motion compensation accuracy is
improved from the ½-pixel accuracy used by
most earlier video codecs. H.264 supports the
same ¼-pixel accuracy that is used on the latest
MPEG-4 video codec.

Multiple Reference Frames 

H.264 adds support for multiple reference
frames. This increases compression by 
improving the prediction process and 
increases error resilience by being able to use 
another reference frame in the event that one 
was lost. 

A single macroblock can use up to 8 refer- 
ence frames (up to 3 for HDTV), with a total 
limit of 16 reference frames used within a 
frame. 

To compensate for the different temporal 
distances between current and reference 
frames, predicted blocks are averaged with 
configurable weighting parameters. These
parameters can either be embedded within the 
bitstream or the decoder may implicitly derive 
them from temporal references. 



Unrestricted Motion Search 

This allows motion vectors to reference areas
that lie outside the picture. The missing data is
spatially predicted from boundary data.



[Diagram labels: P slices, SP slices, P slices; Stream A, Stream B]

Figure 14.6. Using SP Slices to Switch to Another Stream.



[Diagram labels: P slices, SP slices, P slices]

Figure 14.7. Using SP Slices to Fast-Forward.
Transform, Scaling, and Quantization

H.264 uses a simple 4x4 integer trans- 
form. In contrast, older video codecs use an 8 x 
8 DCT that operates on floating-point coeffi- 
cients. An additional 2x2 transform is applied 
to the four CbCr DC coefficients. Intra 16x16 
macroblocks have an additional 4x4 trans- 
form performed for the sixteen Y DC coeffi-
cients.
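
A minimal sketch of a 4 x 4 integer transform follows. It assumes the commonly published H.264 core matrix (rows 1 1 1 1 / 2 1 -1 -2 / 1 -1 -1 1 / 1 -2 2 -1), which is not listed in this chapter, and it omits the scaling that is normally folded into quantization. The helper names are hypothetical.

```python
# Assumed core transform matrix; integer entries mean no rounding is needed.
CF = [[1, 1, 1, 1],
      [2, 1, -1, -2],
      [1, -1, -1, 1],
      [1, -2, 2, -1]]


def matmul(a, b):
    """Multiply two 4x4 integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def forward_core_transform(block):
    """Compute CF * block * CF^T using integer arithmetic only."""
    return matmul(matmul(CF, block), transpose(CF))


flat_block = [[10] * 4 for _ in range(4)]
coeffs = forward_core_transform(flat_block)
```

For a flat block, all of the energy lands in the single DC coefficient and every other coefficient is exactly zero, with no floating-point error involved.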

Blocking and ringing artifacts are reduced 
as a result of the smaller block size used by 
H.264. The use of integer coefficients elimi- 
nates rounding errors that cause drifting arti- 
facts common with DCT-based video codecs. 

For quantization, H.264 uses a set of 52 
uniform scalar quantizers, with a step incre- 
ment of about 12.5% between each. 
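
The relationship between the 52 quantizer indices and their step sizes can be sketched as follows. The six base values below are the commonly cited step sizes for indices 0 through 5 and are an assumption here, since the chapter does not list them; the step size then doubles for every increase of 6 in the index, which works out to roughly a 12% increase per step.

```python
# Assumed base step sizes for quantizer indices 0..5 (not listed in the text).
BASE_QSTEP = [0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125]


def qstep(qp):
    """Return the quantizer step size for index qp (0..51)."""
    if not 0 <= qp <= 51:
        raise ValueError("quantizer index must be in the range 0..51")
    # The step size doubles every 6 indices.
    return BASE_QSTEP[qp % 6] * (1 << (qp // 6))


average_ratio = 2 ** (1 / 6)  # mean growth per index, about 1.12
```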

The quantized coefficients are then 
scanned, from low frequency to high fre- 
quency, using one of two scan orders. 

Entropy Coding 

After quantization and zig-zag scanning,
H.264 uses two types of entropy encoding:
variable-length coding (VLC) and Context Adaptive
Binary Arithmetic Coding (CABAC).

For everything but the transform coefficients,
H.264 uses a single Universal VLC
(UVLC) table that uses an infinite-extent
codeword set (Exponential Golomb). Instead of
multiple VLC tables as used by other video
codecs, only the mapping to the single UVLC
table is customized according to statistics.
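
The Exponential Golomb codeword set mentioned above can be sketched in a few lines. Bit strings are used instead of a real bitstream writer purely for readability, and the function names are hypothetical. An unsigned value v is written as the binary form of v + 1, preceded by one leading zero for every bit after the first; signed values are first mapped onto unsigned ones.

```python
def ue_encode(value):
    """Unsigned Exp-Golomb codeword for value >= 0, as a string of bits."""
    if value < 0:
        raise ValueError("value must be non-negative")
    code = bin(value + 1)[2:]              # binary of value + 1
    return "0" * (len(code) - 1) + code    # prefix of leading zeros


def ue_decode(bits, pos=0):
    """Decode one codeword starting at pos; return (value, next position)."""
    zeros = 0
    while bits[pos + zeros] == "0":
        zeros += 1
    end = pos + 2 * zeros + 1
    return int(bits[pos + zeros:end], 2) - 1, end


def se_encode(value):
    """Signed mapping: positive k -> 2k - 1, zero or negative k -> -2k."""
    mapped = 2 * value - 1 if value > 0 else -2 * value
    return ue_encode(mapped)
```

Because small values get short codewords and the set never runs out, one table serves every syntax element; only the mapping from element values to code numbers changes.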

For transform coefficients, which consume 
most of the bandwidth, H.264 uses Context- 
Adaptive Variable Length Coding (CAVLC). 
Based upon previously processed data, the 
best VLC table is selected. 



Additional efficiency (5-10%) may be
achieved by using Context Adaptive Binary
Arithmetic Coding (CABAC). CABAC continually
updates the statistics of incoming data and
adaptively adjusts the algorithm in real time
using a process called context modeling.

Network Abstraction Layer (NAL) 

The NAL facilitates mapping H.264 data to 
a variety of transport layers including: 

- RTP/IP for wired and wireless Internet services 

- File formats such as MP4 

- H.32X for conferencing 

- MPEG-2 systems 

The data is organized into NAL units, pack- 
ets that contain an integer number of bytes. 
The first byte of each NAL unit indicates the 
payload data type and the remaining bytes con- 
tain the payload data. The payload data may be 
interleaved with additional data to prevent a 
start code prefix from being accidentally gen- 
erated. 
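
The two ideas in the paragraph above, the type field in the first byte and the extra data that keeps a start code prefix from appearing in the payload, can be sketched as follows. The field layout (1 forbidden bit, 2 reference-importance bits, 5 type bits) and the rule of inserting a 0x03 byte after two consecutive zero bytes are stated from general knowledge of H.264 rather than from this chapter, and the function names are hypothetical.

```python
def parse_nal_header(first_byte):
    """Split the first NAL unit byte into its three fields."""
    return {
        "forbidden_zero_bit": (first_byte >> 7) & 0x01,
        "nal_ref_idc": (first_byte >> 5) & 0x03,
        "nal_unit_type": first_byte & 0x1F,
    }


def add_emulation_prevention(payload):
    """Insert 0x03 after two zero bytes when the next byte is 0x00..0x03."""
    out = bytearray()
    zeros = 0
    for byte in payload:
        if zeros >= 2 and byte <= 3:
            out.append(0x03)   # breaks up a would-be start code prefix
            zeros = 0
        out.append(byte)
        zeros = zeros + 1 if byte == 0 else 0
    return bytes(out)


def remove_emulation_prevention(data):
    """Undo add_emulation_prevention on the receiving side."""
    out = bytearray()
    zeros = 0
    for byte in data:
        if zeros >= 2 and byte == 3:
            zeros = 0          # drop the inserted byte
            continue
        out.append(byte)
        zeros = zeros + 1 if byte == 0 else 0
    return bytes(out)
```

After the insertion step, the byte sequence 00 00 01 can no longer occur inside the payload, so a receiver can scan for it safely to find NAL unit boundaries.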

When data partitioning is used, each slice 
is divided into three separate partitions, with 
each partition using a specific NAL unit type. 
This enables data partitioning to be used as an 
efficient layering method that separates the 
data into different levels of importance. By par- 
titioning data into different NAL units, it is 
much easier to use different error protection 
for various parts of the data. 
ATSC

Digital Television 



The ATSC (Advanced Television Systems 
Committee) digital television (DTV) broadcast 
standard is used in the United States, Canada, 
South Korea, Mexico, and Argentina. 

The three other primary DTV standards 
are DVB (Digital Video Broadcast), ISDB 
(Integrated Services Digital Broadcasting), 
and OpenCable. The basic audio and video
capabilities are very similar. The major differ- 
ences are the RF modulation schemes and the 
level of definition for non-audio/video services. 
A comparison of the ATSC standards is shown 
in Table 15.1. 

The ATSC standard is actually a group of 
standards: 

A/52 Digital Audio Compression (AC-3 and E- 
AC-3) Standard 

A/53 ATSC Digital Television Standard 

A/57 Content Identification and Labeling for 
ATSC Transport 

A/64 Transmission Measurement and Compli-
ance for Digital Television 



A/65 Program and System Information Protocol 
for Terrestrial Broadcast and Cable 

A/70 Conditional Access System for Terrestrial 
Broadcast 

A/76 Programming Metadata Communication 
Protocol 

A/80 Modulation and Coding Requirements for 
Digital TV (DTV) Applications Over Satel- 
lite 

A/81 Direct-to-Home Satellite Broadcast Stan- 
dard 

A/90 Data Broadcast Standard 

A/92 Delivery of IP Multicast Sessions over 
Data Broadcast Standard 

A/93 Synchronized/Asynchronous Trigger 
Standard 

A/94 Data Application Reference Model 

A/95 Transport Stream File System Standard 

A/96 Interaction Channel Protocols 

A/97 Software Download Data Service 
A/100 DTV Application Software Environment:
Level 1 (DASE-1) 

A/101 Advanced Common Application Platform 
(ACAP) 

A/110 Synchronization Standard for Distributed 
Transmission 

The ATSC standard uses an MPEG-2 
transport stream to convey compressed digital 
video, compressed digital audio, and data over 
a single 6 MHz channel. Multiple video 
streams, multiple audio streams, and/or data 
may be present in the MPEG-2 transport 
stream. For example, both HD and SD ver- 
sions of a program may be present, along with 
data, such as a local weather forecast. 

The MPEG-2 transport stream has a maximum
bit-rate of ~19.4 Mbps (6 MHz over-the-air
channel) or ~38.8 Mbps (6 MHz digital
cable channel).

The 19.4 Mbps bit-rate can be used in a 
very flexible manner, trading off the number of 



programs offered versus video quality and res- 
olution. For example, 

(1) HDTV program 

(1) HDTV program + (1) SDTV program + data 

(4) SDTV programs 

In addition to E-VSB (discussed later) 
being added to the specifications to support a 
more robust mode of operation, work is being 
done on A-VSB. 

To better address the mobile market and 
compete with DVB-H and DMB, A-VSB will 
improve dynamic multipath tracking, allow the 
use of layered (hierarchical) modulation, sup- 
port time division multiplexing and support 
frame slicing. To support improved terrestrial 
coverage, A-VSB will also ease synchronization 
of broadcast signal timing of different towers 
in a Single Frequency Network (SFN). 



Parameter           ATSC-T           ATSC-C           ATSC-S           ATSC-T E-VSB
                    (Terrestrial)    (Cable)          (Satellite)      (Terrestrial)

video compression   MPEG-2           MPEG-2           MPEG-2           MPEG-2,
                                                                       MPEG-4.10 (H.264)

audio compression   Dolby® Digital   Dolby® Digital   Dolby® Digital   Dolby® Digital,
                                                                       Dolby® Digital Plus

multiplexing        MPEG-2 transport stream (all four columns)

modulation          8-VSB            16-VSB (1)       QPSK, 8PSK       uses ATSC-T

channel bandwidth   6 MHz            6 MHz            -                -

Note:

1. Most digital cable systems use QAM instead of 16-VSB.

Table 15.1. Comparison of ATSC Standards.
Video Capability

Although any resolution may be used as 
long as the maximum bit-rate is not exceeded, 
there are several standardized resolutions, 
indicated in Table 15.2. Both interlaced and 
progressive pictures are permitted for most of 
the resolutions. 

Video compression is based on MPEG-2. 
However, there are some minor constraints on 
some of the MPEG-2 parameters, as discussed 
in the MPEG-2 chapter. Support for using 
MPEG-4.10 (H.264) up to HP@L4.0 is being 
added to the specifications. 



Audio Capability 

Audio compression is implemented using 
Dolby® Digital and supports 1-5.1 channels. 

The main audio, or associated audio which
is a complete service (containing all necessary
program elements), has a bit-rate ≤448 kbps
(384 kbps is typically used). A single-channel
associated service containing a single program
element has a bit-rate ≤128 kbps. A two-channel
associated service containing only dialogue
has a bit-rate ≤192 kbps. The combined bit-rate
of a main and associated service which are
intended to be decoded simultaneously must
be ≤576 kbps.
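
The bit-rate limits above lend themselves to a simple check. The sketch below treats each limit as inclusive, which is an assumption, and the function name and the service-kind labels are hypothetical.

```python
# Limits in kbps, taken from the paragraph above and treated as inclusive.
LIMITS_KBPS = {
    "complete": 448,               # main audio, or a complete associated service
    "single_channel": 128,         # single-channel associated service
    "two_channel_dialogue": 192,   # two-channel dialogue-only associated service
}
COMBINED_LIMIT_KBPS = 576


def check_audio_services(main_kbps, assoc_kbps=None, assoc_kind=None):
    """Return a list of violated constraints (an empty list means acceptable)."""
    problems = []
    if main_kbps > LIMITS_KBPS["complete"]:
        problems.append("main service exceeds 448 kbps")
    if assoc_kbps is not None:
        limit = LIMITS_KBPS[assoc_kind]
        if assoc_kbps > limit:
            problems.append(f"associated service exceeds {limit} kbps")
        if main_kbps + assoc_kbps > COMBINED_LIMIT_KBPS:
            problems.append("combined bit-rate exceeds 576 kbps")
    return problems
```

For example, a 384 kbps main service with a 128 kbps single-channel associated service passes, while a 448 kbps main service with a 192 kbps dialogue service satisfies each individual limit but fails the combined one.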

There are several types of audio services 
defined. 

Main Audio Service: Complete Main 
(CM) 

This type of audio service contains a complete
audio program (dialogue, music, and
effects). This is the type of audio service
normally provided, and may contain 1-5.1 audio
channels.



The CM service may be further enhanced 
by using the VI, HI, C, or VO associated ser- 
vices. Audio in multiple languages may be pro- 
vided by supplying multiple CM services, each 
in a different language. 

Main Audio Service: Music and Effects 
(ME) 

This type of audio service contains the 
music and effects of an audio program, but not 
the dialogue. It may contain 1-5.1 audio chan- 
nels. The primary program dialogue (if any 
exists) is supplied by a D service. 

Associated Service: Visually Impaired 
(VI) 

This service typically contains a narrative 
description of the program content. The VI ser- 
vice uses a single audio channel. The simulta- 
neous decoding of both the VI and CM allows 
the visually impaired to enjoy the program. 

Besides providing VI as a single narrative 
channel, it may be provided as a complete pro- 
gram mix containing music, effects, dialogue, 
and narration. In this case, the service may use 
up to 5.1 channels. 

Associated Service: Hearing Impaired 
(HI) 

This service typically contains only dia- 
logue which is intended to be reproduced 
simultaneously with the CM service. In this 
case, HI is a single audio channel. 

Besides providing HI as a single dialogue 
channel, it may be provided as a complete pro- 
gram mix containing music, effects, and dia- 
logue with enhanced intelligibility. In this case, 
the service may use up to 5.1 channels. 
Active Resolution   SDTV or   Frame Rate (p = progressive, i = interlaced)
(Y)                 HDTV
                              23.976p   29.97i   29.97p   59.94p
                              24p       30i      30p      60p

480 x 480           SDTV      X         X        X        X
528 x 480           SDTV      X         X        X        X
544 x 480           SDTV      X         X        X        X
640 x 480           SDTV      X         X        X        X
704/720 x 480       SDTV      X         X        X        X
1280 x 720          HDTV      X                  X        X
960 x 1080          HDTV      X         X        X
1280 x 1080         HDTV      X         X        X
1440 x 1080         HDTV      X         X        X
1920 x 1080         HDTV      X         X        X

Table 15.2. Common Active Resolutions for ATSC Content.



Associated Service: Dialogue (D) 

This service contains program dialogue 
intended for use with the ME service. 

A complete audio program is formed by 
simultaneously decoding both the D and ME 
services and mixing the D service into the cen- 
ter channel of the ME service. 

If the ME service contains more than two 
audio channels, the D service is monophonic. 
If the ME service contains two channels, the D 
service may also contain two channels. In this 
case, a complete audio program is formed by 
simultaneously decoding the D and ME ser- 
vices, mixing the left channels of the ME and 
D service, and mixing the right channels of the 
D and ME service. The result will be a two- 
channel stereo signal containing music, 
effects, and dialogue. 

Audio in multiple languages may be pro- 
vided by supplying multiple D services (each 
in a different language) along with a single ME 



service. This is more efficient than providing 
multiple CM services. However, in the case of 
more than two audio channels in the ME ser- 
vice, this requires that the dialogue be 
restricted to the center channel. 

Associated Service: Commentary (C) 

This service is similar to the D service, 
except that instead of conveying essential pro- 
gram dialogue, it conveys an optional program 
commentary using a single audio channel. 

In addition, it may be provided as a com- 
plete program mix containing music, effects, 
dialogue and commentary. In this case, the ser- 
vice may use up to 5.1 channels. 

Associated Service: Voice-Over (VO) 

This service is a single-channel service 
intended to be decoded and mixed with the 
ME service. 
Program and System
Information Protocol (PSIP) 

Enough bandwidth is available within the 
MPEG-2 transport stream to support several 
low-bandwidth non-television services such as 
program guide, closed captioning, weather 
reports, stock indices, headline news, software 
downloads, pay-per-view information, etc. The 
number of additional non-television services 
(virtual channels) may easily reach ten or 
more. In addition, the number and type of ser- 
vice will constantly be changing. 

To support these non-television services in 
a flexible yet consistent manner, the Program
and System Information Protocol (PSIP) was 
developed. PSIP is a small collection of hierar- 
chically associated tables (see Figure 15.1 and 
Table 15.3) designed to extend the MPEG-2 
PSI tables. It describes the information for all 
virtual channels carried in a particular MPEG- 
2 transport stream. Additionally, information 
for analog broadcast channels may be incorpo- 
rated. 

Required Tables 

Event Information Table (EIT)

There are up to 128 EITs, EIT-0 through 
EIT-127, each of which describes the events or 
TV programs associated with each virtual 
channel listed in the VCT. Each EIT is valid for 
three hours. Since there are up to 128 EITs, up 
to 16 days of programming may be advertised 
in advance. The first four EITs are required 
(the first 24 are recommended) to be present. 
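
The arithmetic behind the 16-day figure, and the mapping from an event's start time to the EIT that would carry it, can be sketched as below. The sketch assumes that the 3-hour windows are aligned to 00:00, 03:00, 06:00, and so on in UTC, and that EIT-0 covers the window containing the current time; both are labeled assumptions, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

EIT_SPAN = timedelta(hours=3)   # each EIT is valid for three hours
MAX_EITS = 128                  # EIT-0 through EIT-127

coverage_days = MAX_EITS * EIT_SPAN / timedelta(days=1)   # 128 * 3 h = 16 days


def eit_index(now, event_start):
    """Return k such that EIT-k covers event_start, or None if out of range."""
    # Assumed alignment: windows start at multiples of 3 hours (UTC).
    window_start = now.replace(hour=now.hour - now.hour % 3,
                               minute=0, second=0, microsecond=0)
    k = (event_start - window_start) // EIT_SPAN
    return k if 0 <= k < MAX_EITS else None


now = datetime(2024, 1, 1, 13, 20, tzinfo=timezone.utc)
```

With this clock value, an event starting at 14:59 falls in EIT-0 and one starting at 15:00 falls in EIT-1; anything 16 or more days past the start of the current window cannot be advertised.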

Information provided by the EIT includes 
start time, duration, title, pointer to optional 
descriptive text for the event, advisory data, 
caption service data, audio service descriptor, 
and so on. 



Master Guide Table (MGT) 

This table provides general information 
about the other tables. It defines table sizes, 
version numbers, and packet identifiers 
(PIDs). 

Rating Region Table (RRT) 

This table transmits the rating system, 
commonly referred to as the “V-chip.” 

System Time Table (STT) 

This table serves as a reference for the 
time of day. Receivers use it to maintain the 
correct local time. 

Terrestrial Virtual Channel Table (TVCT) 

This table, also referred to as the VCT 
although there is also a Cable VCT (CVCT) 
and Satellite VCT (SVCT), contains a list of all 
the channels in the transport stream that are 
or will be available, plus their attributes. It may 
also include the broadcaster’s analog channel 
and digital channels in other transport 
streams. 

Attributes for each channel include major/ 
minor channel number, short name, Trans- 
port/Transmission System ID (TSID) that 
uniquely identifies each station, etc. The Ser- 
vice Location Descriptor is used to list the PIDs 
for the video, audio, data, and other related ele- 
mentary streams. 

Optional Tables 

Extended Text Table (ETT) 

For text messages, there can be several 
ETTs, each having its PID defined by the 
MGT. Messages can describe channel informa- 
tion, coming attractions, movie descriptions, 
and so on. 
Terrestrial Broadcast Tables

Table     PID        Table_ID              Repetition Rate

PMT       per PAT    0x02                  400 ms
MGT       0x1FFB     0xC7                  150 ms
VCT       0x1FFB     0xC8                  400 ms
RRT       0x1FFB     0xCA                  1 min
EIT       per MGT    0xCB                  0.5 sec
ETT       per MGT    0xCC                  1 min
STT       0x1FFB     0xCD                  1 sec
DCCT      0x1FFB     0xD3                  400 ms
DCCSCT    0x1FFB     0xD4                  1 hour
CAT       0x0001     0x80, 0x81 (ECM),     -
                     0x82 - 0x8F (EMM)

Descriptor                  Descriptor Tag   PMT  MGT  VCT  RRT  EIT  ETT  STT  DCCT  DCCSCT  CAT

AC-3 audio stream           1000 0001        M                   M
ATSC CA                     1000 1000                  O         O
ATSC private information*   1010 1101
CA                          0000 1001        M                                                M
caption service             1000 0110        M                   M
component name              1010 0011        M
content advisory            1000 0111        M                   M
content identifier          1011 0110        O                   M
DCC arriving request        1010 1001                                           M
DCC departing request       1010 1000                                           M
enhanced signaling          1011 0010        M (PMT-E)
extended channel name       1010 0000                  M
genre                       1010 1011                            M
redistribution control      1010 1010        M                   M
service location            1010 0001                  M
SRM reference               0000 1001                                                         M
stuffing*                   1000 0000
time-shifted service        1010 0010                  M

Note:

1. M = when present, required in this table. O = may be present in this table also. * = no restrictions.

Table 15.3. List of ATSC PSIP Tables, Descriptors, and Descriptor Locations.
Figure 15.1. ATSC PSIP Table Relationships.



Directed Channel Change Table (DCCT) 

The DCCT contains information needed 
for a channel change to be done at a broad- 
caster-specified time. The requested channel 
change may be unconditional or may be based 
upon criteria specified by the viewer. 

Directed Channel Change Selection Code 
Table (DCCSCT) 

The DCCSCT permits a broadcast pro- 
gram categorical classification table to be 
downloaded for use by some Directed Channel 
Change requests. 



Descriptors 

Much like MPEG-2, ATSC uses descrip- 
tors to add new functionality. In addition to var- 
ious MPEG-2 descriptors, one or more of these 
ATSC-specific descriptors may be included 
within the PMT or one or more PSIP tables 
(see Table 15.3) to extend data within the 
tables. A descriptor not recognized by a 
decoder must be ignored by that decoder. This 
enables new descriptors to be implemented 
without affecting receivers that cannot recog- 
nize and process the descriptors. 
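
The rule that unrecognized descriptors are skipped can be illustrated with a short descriptor-loop walker. It relies on the standard MPEG-2 descriptor layout of an 8-bit tag followed by an 8-bit length; the tag values are the binary tags from Table 15.3 written in hexadecimal, and the function name and sample bytes are hypothetical.

```python
# A few ATSC descriptor tags from Table 15.3, converted from binary to hex.
KNOWN_TAGS = {
    0x81: "AC-3 audio stream",
    0x86: "caption service",
    0x87: "content advisory",
    0xA0: "extended channel name",
    0xA1: "service location",
    0xA2: "time-shifted service",
    0xA3: "component name",
}


def walk_descriptors(loop, known=KNOWN_TAGS):
    """Return (name, payload) for recognized descriptors, skipping the rest."""
    found = []
    pos = 0
    while pos + 2 <= len(loop):
        tag, length = loop[pos], loop[pos + 1]
        body = loop[pos + 2:pos + 2 + length]
        if len(body) < length:
            raise ValueError("truncated descriptor")
        if tag in known:
            found.append((known[tag], bytes(body)))
        # Unknown tags are ignored: the length field says how far to skip.
        pos += 2 + length
    return found


# Made-up loop: a known descriptor, an unknown one (tag 0xFE), a known one.
sample_loop = bytes([0x81, 2, 0xAA, 0xBB, 0xFE, 1, 0x00, 0xA0, 0])
names = [name for name, _ in walk_descriptors(sample_loop)]
```

Because every descriptor carries its own length, a receiver never needs to understand a descriptor's contents in order to step over it.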






AC-3 Audio Stream Descriptor 

This ATSC descriptor indicates Dolby® 
Digital or Dolby® Digital Plus audio is present, 
and is discussed in Chapter 13. 

ATSC CA Descriptor 

This ATSC descriptor has a syntax almost 
the same as the MPEG-2 CA descriptor. 

ATSC Private Information Descriptor

This ATSC descriptor provides a way to 
carry private information, and is discussed in 
Chapter 13. More than one descriptor may
appear within a single descriptor loop.

Component Name Descriptor 

This ATSC descriptor defines a variable- 
length text-based name for any component of 
the service, and is discussed in Chapter 13. 

Content Advisory Descriptor 

This ATSC descriptor defines the ratings 
for a given program, and is discussed in Chap- 
ter 13. 

Content Identifier Descriptor 

This ATSC descriptor is used to uniquely
identify content within the ATSC transport.

DCC Arriving Request Descriptor 

This ATSC descriptor provides instruc- 
tions for the actions to be performed by a 
receiver upon arrival to a newly changed chan- 
nel: 

Display text for at least 10 seconds, or for
less time if the viewer issues a “continue,”
“OK,” or equivalent command.

Display text indefinitely, or until the viewer 
issues a “continue,” “OK,” or equivalent com- 
mand. 



DCC Departing Request Descriptor 

This ATSC descriptor provides instruc- 
tions for the actions to be performed by a 
receiver prior to leaving a channel: 

Cancel any outstanding things and immedi- 
ately perform the channel change. 

Display text for at least 10 seconds, or for
less time if the viewer issues a “continue,”
“OK,” or equivalent command.

Display text indefinitely, or until the viewer 
issues a “continue,” “OK,” or equivalent com- 
mand. 

Enhanced Signaling Descriptor 

This ATSC descriptor identifies the terres- 
trial broadcast transmission method of a pro- 
gram element, and is discussed in Chapter 13. 

Extended Channel Name Descriptor 

This ATSC descriptor provides a variable- 
length channel name for the virtual channel. 

Genre Descriptor 

This ATSC descriptor provides genre, program
type, or category information for events,
and may appear in the descriptor() loop for the
given EIT event. It references entries in the
Categorical Genre Code Assignments Table
and may include references to expansions to
that table provided by the DCC Selection
Code.

Redistribution Control Descriptor 

This ATSC descriptor conveys any redistri- 
bution control information held by the pro- 
gram rights holder for the content, and is 
discussed in Chapter 13. 
Service Location Descriptor

This ATSC descriptor specifies the stream 
type, PID, and language code for each elemen- 
tary stream. It is present in the VCT for each 
active channel. 

SRM Reference Descriptor 

This ATSC descriptor is a specific implementation
of the MPEG-2 CA Descriptor discussed
in Chapter 13. It is used to signal that a
System Renewability Message is present for
the System Renewability Message Table
(SRMT). It is present in the CAT.

Time-Shifted Service Descriptor 

This ATSC descriptor links one virtual 
channel with up to 20 other virtual channels 
carrying the same programming, but time- 
shifted. A typical application is for Near Video 
On Demand (NVOD) services. 



E-VSB 

E-VSB, also referred to as “Enhanced 8- 
VSB,” enables ATSC broadcasters to include a 
secondary lower bit-rate program that is more 
robust than typical HDTV programs in low-sig- 
nal conditions. If interference degrades the pri- 
mary HDTV signal, the receiver switches to a 
more robust SDTV version of the same pro- 
gram that has been multiplexed into that trans- 
port stream. For example, within the 19 Mbps 
channel, 14 Mbps could be used for the HD 
program, 4 Mbps for the robust SD program, 
and 1 Mbps for management overhead. 
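
The bit-rate split in this example is simple arithmetic, sketched below as a budget check. The 19 Mbps channel figure and the three allocations come from the paragraph above; the function name and dictionary keys are hypothetical.

```python
def fits_channel(allocations_mbps, channel_mbps=19.0):
    """Return (fits, spare) for a set of bit-rate allocations in Mbps."""
    total = sum(allocations_mbps.values())
    return total <= channel_mbps, channel_mbps - total


example = {"hd_program": 14.0, "robust_sd_program": 4.0, "overhead": 1.0}
fits, spare = fits_channel(example)
```

The example allocation uses the channel exactly, so any increase in the robust SD program's bit-rate has to come out of the HD program.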

When E-VSB is not used for the fallback 
audio feature, it may instead be used to provide 
additional services, such as enhanced audio, 
additional audio-only services, or metadata to 
control the mixing of two audio streams. 



Audio Capability 

Audio compression for the enhanced ser- 
vice uses Dolby® Digital or Dolby® Digital Plus 
and supports 1-5.1 channels, with a sample 
rate of 48 kHz and a maximum bit-rate of 448 
kbps. Use of Dolby® Digital is allowed, but use 
of Dolby® Digital Plus is preferred to maintain 
the highest possible bit-rate for the main pro- 
gram. 

Audio service types are restricted to CM,
VI, HI and C. Each of these audio services
must contain a complete audio program
(dialogue, music, and effects).

When E-VSB is not used for the fallback 
audio feature, more than 5.1 channels may be 
used and/or sample rates other than 48 kHz 
may be used. 

Video Capability 

Video compression for the enhanced ser- 
vice will likely use MPEG-4.10 (H.264) up to 
HP@L4.0. Use of MPEG-2 is allowed, but use 
of MPEG-4.10 (H.264) is preferred to maintain 
the highest possible bit-rate for the main pro- 
gram. 

Program and System Information 
Protocol (PSIP-E) 

PSIP-E is program and system information 
transmitted using the E-VSB mode. 

PAT-E: Program Association Table. Syntax is the
same as for the PAT.

PMT-E: Program Map Table. Syntax is the same as
for the PMT.
Data Broadcasting

The ATSC data broadcast standard 
describes various ways to transport data within 
the MPEG-2 transport stream. It can be used 
for a number of applications, such as: 

Delivering declarative data (HTML code) 

Delivering procedural data (Java code) 

Delivering software and images 

Delivering MPEG-4.2, MPEG-4.10 (H.264), or
SMPTE 421M (VC-1) video streams

Carouseling MPEG-2 video or MP3 audio files 

The key elements defined within the stan- 
dard are: 

Data services announcement 

Data delivery models such as data piping, data 
streaming, addressable sections, and data down- 
load 

Application signaling

MPEG-2 systems tools

Protocols

A data service must be contained in a vir- 
tual channel, and each virtual channel may 
have at most one data service. One data ser- 
vice may consist of multiple applications, and 
each application may contain multiple data ele- 
ments. 

Data broadcasting is also discussed in 
Chapter 13. 

Data Service Announcements 

Data broadcasting utilizes and extends 
PSIP to announce and find data services in the 



broadcast stream. Data services are 
announced by an event in either the PSIP EIT 
or Data Event Tables (DET). 

Additional tables for data service
announcements include (additional descriptors
for data services are not discussed):

Data Event Table (DET) 

There are up to 128 DETs, DET-0 through 
DET-127, each of which describes information 
(titles, start times, etc.) for data services on vir- 
tual channels. Each DET is valid for 3 hours. A 
minimum of four DETs (DET-0 through DET- 
3) are required. Any change in a DET shall 
trigger a change in version of the MGT. 

Extended Text Table (ETT) 

The ETT is used to provide detailed 
descriptions of a data event. The syntax is 
mostly identical to ETT used for AV services. 

Long-Term Service Table (LTST) 

The LTST is used to pre-announce data 
events that will occur on a time scale outside 
what the EITs/DETs can support. 

Service Description Framework (SDF) 

Due to the wide range of protocol encapsu- 
lations possible, there is a need to signal which 
encapsulation is used in each data stream. 
While it is possible to use the PMT for signal- 
ing, this approach is not very scalable, working 
only for simple cases. The Service Description 
Framework (SDF) was developed to provide a 
scalable framework. 

Additional tables for the service descrip- 
tion framework include: 
Data Service Table (DST)

The DST provides the description of a data 
service comprised of one or more receiver 
applications. It also provides information to 
allow data receivers to associate applications 
with references to data consumed by them. 

Network Resources Table (NRT) 

The NRT provides a list of all network 
resources outside those in the current MPEG- 
2 program or transport stream. 

A data service may use the NRT to get 
data packets or datagrams other than the ones 
published in the Service Location Descriptor of 
the VCT. This includes data elementary 
streams in another program within the same 
transport stream, data elementary streams in 
other transport streams, and bi-directional 
communication channels using other protocols 
such as IP. 

Triggers (Synchronized and 
Asynchronous) 

Support for synchronized and asynchro- 
nous triggers enables the synchronized deliv- 
ery of data modules through the decoupling of 
the timing from the delivery of the data ele- 
ment. It also enables the delivery of events to 
receivers, including application-defined events. 
Triggers are conveyed as data modules in the 
download protocol. 

Triggers carry pointers to objects to be 
activated, pointers to applications that need to 
process pre-loaded data for presentation, or 
additional self-contained user data. 

Synchronized triggers are activated when 
the 90 kHz part of the receiver’s STC matches 
the PTS value specified by the trigger. Asyn- 
chronous triggers are activated as soon as the 
trigger decoding is complete. 
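
The comparison between the 90 kHz part of the STC and a trigger's PTS can be sketched as follows. It relies on the standard MPEG-2 relationship in which the 27 MHz system clock equals a 33-bit 90 kHz base times 300 plus an extension; firing once the base has reached or passed the PTS (rather than only on exact equality) and the wraparound handling are assumptions, and the function names are hypothetical.

```python
PTS_MODULUS = 1 << 33   # PTS and the STC base are 33-bit counters


def stc_base_90khz(stc_27mhz):
    """Extract the 90 kHz part of a 27 MHz system time clock value."""
    return (stc_27mhz // 300) % PTS_MODULUS


def synchronized_trigger_due(stc_27mhz, trigger_pts):
    """True once the 90 kHz STC base has reached the trigger's PTS."""
    elapsed = (stc_base_90khz(stc_27mhz) - trigger_pts) % PTS_MODULUS
    # Treat differences under half the counter range as "already reached".
    return elapsed < PTS_MODULUS // 2


def asynchronous_trigger_due(decoding_complete):
    """Asynchronous triggers fire as soon as decoding is complete."""
    return decoding_complete
```

The modular comparison keeps the check correct when the 33-bit counter wraps, which happens roughly every 26.5 hours at 90 kHz.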



Software Download Data Service 

The Software Download Data Service 
(SDDS) defines the delivery of software. It 
builds on the data service delivery mechanism 
defined in the Data Broadcast Standard. SDDS
supports applications such as updating firmware,
middleware, applications, and the operating
system.

Transport Stream File System 

The Transport Stream File System (TSFS) 
defines the delivery of hierarchical name- 
spaces, directories, and files. It builds on the 
data service delivery mechanism defined in 
the Data Broadcast Standard. TSFS supports 
the transmission of the directory, file, and ser- 
vice gateway objects using the MPEG-2.6 
DSM-CC Data Carousel protocol. 



Application Block Diagrams 

Figure 15.2 illustrates a typical ATSC 
receiver set-top box block diagram. A common 
requirement is the ability to output both high- 
definition and standard-definition versions of a 
program simultaneously. 

Figure 15.3 illustrates a typical ATSC digi- 
tal television block diagram. A common 
requirement is the ability to decode two pro- 
grams simultaneously to support Picture-in- 
Picture (PIP). 
Figure 15.2. Typical ATSC Receiver Set-Top Box Block Diagram 
(Diagram not reproduced. Labeled outputs: NTSC, S-Video, 
YPbPr, HDMI, 5.1-channel audio, S/PDIF.) 
Figure 15.3. Typical ATSC Digital Television Block Diagram 
(Diagram not reproduced. Labeled inputs: RF in, audio in, 
NTSC, S-Video, YPbPr, HDMI. Labeled outputs: to display; 
to power amplifiers and speakers.) 
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OpenCable 
Digital Television 



OpenCable™ is a digital cable standard for 
the United States, designed to offer interopera- 
bility between different hardware and software 
suppliers. A subset of the standard is being 
incorporated inside digital televisions. 

The three other primary DTV standards 
are ATSC (Advanced Television Systems Com- 
mittee), DVB (Digital Video Broadcast), and 
ISDB (Integrated Services Digital Broadcast- 
ing). The basic audio and video capabilities are 
very similar. The major differences are the RF 
modulation schemes and the level of definition 
for non-audio/video services. A summary of 
the OpenCable standard is shown in Table 
16.1. 

OpenCable is based on these and additional 
ATSC and SCTE standards: 

A/52     Digital Audio Compression (AC-3, 
         E-AC-3) Standard 

A/53     ATSC Digital Television Standard 

A/65     Program and System Information 
         Protocol for Terrestrial Broadcast and 
         Cable 

A/90     ATSC Data Broadcast Standard 

SCTE 07  Digital Transmission Standard for 
         Cable Television 

SCTE 18  Emergency Alert Message for Cable 

SCTE 20  Method for Carriage of Closed 
         Captions and Non-Real-Time Sampled 
         Video 

SCTE 26  Home Digital Network Interface 
         Specification with Copy Protection 

SCTE 40  Digital Cable Network Interface 
         Standard 

SCTE 43  Digital Video Systems Characteristics 
         Standard for Cable Television 

SCTE 54  Digital Video Service Multiplex and 
         Transport System for Cable Television 

SCTE 55  Digital Broadband Delivery System: 
         Out of Band Transport 

SCTE 65  Service Information Delivered 
         Out-of-Band for Digital Cable Television 

SCTE 80  In-Band Data Broadcast Standard 
         including Out-of-Band Announcements 
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OpenCable™ receivers use the following 
four communications channels over the digital 
cable network: 

6 MHz NTSC analog channels. They are 
typically located in the 54-450 MHz range. 

Each channel carries one program. 

6 MHz Forward Application Transport 
(FAT) channels, which carry content via 
MPEG-2 transport streams. They use QAM 
encoding and are typically located in the 450- 
864 MHz range. Each channel can carry multi- 
ple programs. 

Out-of-Band (OOB) Forward Data Chan- 
nels (FDC). They use QPSK modulation and 
are typically located in the 70-130 MHz range, 
spaced between the 6 MHz NTSC analog and/ 
or FAT channels. SCTE 55-1 and SCTE 55-2 
are two alternative implementations. 



Out-of-Band (OOB) Reverse Data Channels 
(RDC). They use QPSK modulation and are 
typically located in the 5-42 MHz range. SCTE 
55-1, SCTE 55-2, and DOCSIS® provide three 
alternative implementations. 

OpenCable™ receivers obtain content by 
tuning to one of many 6 MHz channels avail- 
able via the cable TV connection. When the 
selected channel is a legacy analog channel, 
the signal is processed using a NTSC audio/ 
video/VBI decoder. 

When the selected channel is a digital 
channel, it is processed by a QAM demodulator 
and then a CableCARD for content 
descrambling (conditional access 
descrambling). The conditional access 
descrambling is specific to a given cable system 
and is usually proprietary. The CableCARD 
then rescrambles the content to a common 
algorithm and passes it on to the MPEG-2 
decoder. The multi-stream CableCARD is 
capable of handling up to six different channels 
simultaneously, enabling picture-in-picture and 
DVR (digital video recording) capabilities. 

Parameter            OpenCable 
video compression    MPEG-2 
audio compression    Dolby® Digital 
multiplexing         MPEG-2 transport stream 
modulation           QAM 
channel bandwidth    6 MHz 

Table 16.1. Summary of the OpenCable Standard. 

When the CableCARD is not inserted, 
the output of the digital tuner’s QAM demodu- 
lator is routed directly to the MPEG-2 decoder. 
However, encrypted content will not be view- 
able. 

OpenCable receivers also obtain control 
information and other data by tuning to the 
OOB FDC channel. Using a dedicated tuner, 
the receiver remains tuned to the OOB FDC to 
receive information continuously. This infor- 
mation is also passed to the CableCARD and 
MPEG-2 decoder for processing. 

The bi-directional OpenCable receiver 
can also transmit data using the OOB RDC. 

The OpenCable™ standard uses an MPEG- 
2 transport stream to convey compressed digi- 
tal video, compressed digital audio, and ancil- 
lary data over a single 6 MHz FAT channel. 
Multiple video streams, multiple audio streams 
and/or data may be present in the MPEG-2 
transport stream. 

The MPEG-2 transport stream has a constant 
bit-rate of ~27 Mbps (64-QAM modulation), 
~38.8 Mbps (256-QAM), or ~44.3 Mbps 
(1024-QAM). 

The available bit-rate can be used in a very 
flexible manner, trading off the number of 
programs offered versus video quality and 
resolution. For example, if MPEG-2 video, 
statistical multiplexing, and 256-QAM are used, 
a single 6 MHz channel can carry any one of 
the following: 
(4) HDTV programs 

(2) HDTV programs + (6) SDTV programs + data 

(18) SDTV programs 
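
As a rough illustration of the trade-off, the sketch below derives the average bit-rate each program would receive in the all-HDTV and all-SDTV examples, and what would be left over for data in the mixed example. These averages are inferred from the ~38.8 Mbps 256-QAM figure and the program counts above; they are not values from the specification, and a statistical multiplexer would vary the per-program rates over time. The function name is hypothetical.

```python
# Assumption: ~38.8 Mbps is the 256-QAM transport stream rate quoted above.
CHANNEL_RATE_MBPS = 38.8


def average_rate_mbps(program_count, channel_rate=CHANNEL_RATE_MBPS):
    """Average share of the channel per program, in Mbps."""
    return channel_rate / program_count


hd_avg = average_rate_mbps(4)    # four HDTV programs fill the channel
sd_avg = average_rate_mbps(18)   # eighteen SDTV programs fill the channel

# Mixed example: 2 HDTV + 6 SDTV programs, remainder available for data.
data_left = CHANNEL_RATE_MBPS - (2 * hd_avg + 6 * sd_avg)

print(f"HDTV average: {hd_avg:.2f} Mbps")
print(f"SDTV average: {sd_avg:.2f} Mbps")
print(f"Left for data in the mixed case: {data_left:.2f} Mbps")
```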



Video Capability 

Digital video compression is implemented 
using MPEG-2 and has the same requirements 
as ATSC. There are some minor constraints on 
some of the MPEG-2 parameters, as discussed 
within the MPEG-2 chapter. Support for using 
MPEG-4.10 (H.264) up to HP@L4.0 is being 
added to the specifications. 

Although any resolution may be used as 
long as the maximum bit-rate is not exceeded, 
there are several standardized resolutions, 
indicated in Table 16.2. Both interlaced and 
progressive pictures are permitted for most of 
the resolutions. 

Compliant receivers must also be capable 
of tuning to and decoding analog NTSC sig- 
nals, discussed in Chapter 8. 

Audio Capability 

Digital audio compression is implemented 
using Dolby® Digital and has the same require- 
ments as ATSC. 

Compliant receivers must also be capable 
of decoding the audio portion of analog NTSC 
signals. NTSC audio standards are discussed 
in Chapter 8. 



In-Band System Information 
(SI) 

Enough bandwidth is available within the 
MPEG-2 transport stream to support several 
low-bandwidth non-television services such as 
program guide, closed captioning, weather 
reports, stock indices, headline news, software 
downloads, pay-per-view information, etc. The 
number of additional non-television services 
(virtual channels) may easily reach ten or 
more. In addition, the number and type of 
services will be constantly changing. 

                              Frame Rate (p = progressive, i = interlaced) 
Active Resolution   SDTV or   23.976p   29.97i   29.97p   59.94p 
(Y)                 HDTV      24p       30i      30p      60p 
480 x 480           SDTV      X         X        X        X 
528 x 480           SDTV      X         X        X        X 
544 x 480           SDTV      X         X        X        X 
640 x 480           SDTV      X         X        X        X 
704 x 480           SDTV      X         X        X        X 
1280 x 720          HDTV      X                  X        X 
960 x 1080          HDTV      X         X        X 
1280 x 1080         HDTV      X         X        X 
1440 x 1080         HDTV      X         X        X 
1920 x 1080         HDTV      X         X        X 

Table 16.2. Common Active Resolutions for OpenCable Content. 

To support these non-television services in 
a flexible, yet consistent, manner, System Infor- 
mation (SI) was developed. SI is a small collec- 
tion of hierarchically associated tables (see 
Figure 16.1 and Table 16.3) designed to extend 
the MPEG-2 PSI tables. It describes the infor- 
mation for all virtual channels carried in a par- 
ticular MPEG-2 transport stream. Additionally, 
information for analog broadcast channels may 
be incorporated. 

For in-band SI, OpenCable largely follows 
the ATSC PSIP standard, with some 
extensions. 



Required Tables 

Cable Virtual Channel Table (CVCT) 

This table contains a list of all the channels 
in the transport stream that are or will be avail- 
able plus their attributes. It may also include 
the broadcaster’s analog channel and digital 
channels in other transport streams. 

Attributes for each channel include major/ 
minor channel number, short name, Trans- 
port/Transmission System ID (TSID) that 
uniquely identifies each station, carrier fre- 
quency, modulation mode, etc. The Service 
Location Descriptor is used to list the PIDs for 
the video, audio, data, and other related ele- 
mentary streams. 

ATSC also uses a version of this table 
called the Terrestrial Virtual Channel Table 
(TVCT). 
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Emergency Alert (EA) Table 

This table provides a signaling method 
that enables cable TV operators to send emer- 
gency messages to digital set-top boxes, digital 
television receivers, digital VCRs, etc. These 
devices must be able to store any EA events for 
later use; the start time and duration informa- 
tion are used to delete expired events. 

Typically, transport streams originating 
from terrestrial digital broadcast sources 
located in the same geographic region as the 
cable hub also provide any Emergency Alert 
information within their broadcast. 

Event Information Table (EIT) 

There are up to 128 EITs, EIT-0 through 
EIT-127, each of which describes the events or 
TV programs associated with each virtual 
channel listed in the CVCT. Each EIT is valid 
for 3 hours. Since there are up to 128 EITs, up 
to 16 days of programming may be advertised 
in advance. The first four EITs are required 
(the first 24 are recommended) to be present. 

Information provided by the EIT includes 
start time, duration, title, pointer to optional 
descriptive text for the event, advisory data, 
caption service data, audio service descriptor, 
etc. ATSC also uses this table. 
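
The arithmetic behind the 16-day figure, and the mapping from an event's start time to the EIT that would list it, can be sketched as follows. This is a minimal illustration that assumes each EIT window is aligned to a 3-hour UTC boundary (00:00, 03:00, 06:00, and so on) and that EIT-0 is the window containing the current time; the function and constant names are hypothetical.

```python
from datetime import datetime, timezone

EIT_SPAN_HOURS = 3   # each EIT is valid for 3 hours
MAX_EITS = 128       # EIT-0 through EIT-127

# 128 tables x 3 hours = 384 hours = 16 days of advance listings
COVERAGE_DAYS = MAX_EITS * EIT_SPAN_HOURS / 24


def eit_index(now: datetime, event_start: datetime):
    """Return k such that EIT-k would cover event_start, else None.

    Both arguments must be timezone-aware. Windows are assumed to be
    aligned to 3-hour UTC boundaries, which integer division of the
    Unix timestamp gives directly.
    """
    span_seconds = EIT_SPAN_HOURS * 3600
    now_slot = int(now.timestamp()) // span_seconds
    event_slot = int(event_start.timestamp()) // span_seconds
    k = event_slot - now_slot
    return k if 0 <= k < MAX_EITS else None


now = datetime(2024, 1, 1, 1, 30, tzinfo=timezone.utc)
later = datetime(2024, 1, 1, 4, 0, tzinfo=timezone.utc)
print(COVERAGE_DAYS, eit_index(now, later))
```

Returning `None` for events outside the 128-table horizon keeps the caller from asking for a table that cannot exist.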

Master Guide Table (MGT) 

This table provides general information 
about the other tables. It defines table sizes, 
version numbers, and packet identifiers 
(PIDs). ATSC also uses this table. 



Rating Region Table (RRT) 

This table transmits the rating system, 
commonly referred to as the “V-chip.” ATSC 
also uses this table. 

System Time Table (STT) 

This table serves as a reference for the 
time of day. Receivers use it to maintain the 
correct local time. ATSC also uses this table. 

Optional Tables 

Directed Channel Change Table (DCCT) 

The DCCT contains information needed 
for a channel change to be done at a broad- 
caster-specified time. The requested channel 
change may be unconditional or may be based 
upon criteria specified by the viewer. ATSC 
also optionally uses this table. 

Directed Channel Change Selection Code 
Table (DCCSCT) 

The DCCSCT permits a broadcast pro- 
gram categorical classification table to be 
downloaded for use by some Directed Channel 
Change requests. ATSC also optionally uses 
this table. 

Extended Text Table (ETT) 

For text messages, there can be several 
ETTs, each having its PID defined by the 
MGT. Messages can describe channel informa- 
tion, coming attractions, movie descriptions, 
etc. ATSC also optionally uses this table. 
Descriptor                Descriptor  Tables 
                          Tag         PMT      MGT      CVCT     RRT      EIT      ETT      STT      DCCT     DCCSCT   CAT 
PID                                   per PAT  0x1FFB   0x1FFB   0x1FFB   per MGT  per MGT  0x1FFB   0x1FFB   0x1FFB   0x0001 
Table_ID                              0x02     0xC7     0xC9     0xCA     0xCB     0xCC     0xCD     0xD3     0xD4     0x80, 0x81 (ECM); 0x82-0x8F (EMM) 
repetition rate                       400 ms   150 ms   400 ms   1 min    0.5 sec  1 min    10 sec   400 ms   1 hour 

AC-3 audio stream         1000 0001   M 
ATSC CA                   1000 1000                     O                 O 
ATSC private information* 1010 1101 
CA                        0000 1001   M                                                                                M 
caption service           1000 0110   M                                   M 
component name            1010 0011   M 
content advisory          1000 0111   M                                   M 
DCC arriving request      1010 1001                                                                  M 
DCC departing request     1010 1000                                                                  M 
extended channel name     1010 0000                     M 
extended video            1000 0011   M 
frame rate                1000 0010   M 
MAC address list          1010 1100   M 
redistribution control    1010 1010   M                                   M 
service location          1010 0001                     M 
stuffing*                 1000 0000 
time-shifted service      1010 0010                     M 

Notes: 

1. PMT: MPEG-2 Program Map Table. CAT: MPEG-2 Conditional Access Table. 

2. M = when present, required in this table. O = may be present in this table also. * = no restrictions. 

Table 16.3. List of OpenCable In-Band SI Tables, Descriptors, and Descriptor Locations. 

Figure 16.1. OpenCable In-Band SI Table Relationships. 



Descriptors 

Much like MPEG-2, OpenCable™ uses 
descriptors to add new functionality. In addi- 
tion to various MPEG-2 descriptors, one or 
more of these OpenCable-specific descriptors 
may be included within the PMT or one or 
more SI tables (see Table 16.3) to extend data 
within the tables. A descriptor not recognized 
by a decoder must be ignored by that decoder. 
This enables new descriptors to be imple- 
mented without affecting receivers that cannot 
recognize and process the descriptors. 
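
The "ignore what you do not recognize" rule works because every MPEG-2 descriptor starts with an 8-bit tag followed by an 8-bit length, so a receiver can step over a payload it does not understand. The sketch below walks a descriptor loop that way. It is an illustration only: the `KNOWN_DESCRIPTORS` mapping lists a handful of the tags from Table 16.3, and the function name is hypothetical.

```python
# A few descriptor tags from Table 16.3 (illustrative subset only).
KNOWN_DESCRIPTORS = {
    0x81: "AC-3 audio stream",
    0x86: "caption service",
    0x87: "content advisory",
    0xA0: "extended channel name",
    0xA1: "service location",
    0xA2: "time-shifted service",
}


def parse_descriptor_loop(data: bytes):
    """Return (name, payload) pairs for recognized descriptors.

    Each descriptor is: tag (1 byte), length (1 byte), payload (length
    bytes). Unrecognized tags are skipped using the length field.
    """
    found = []
    pos = 0
    while pos + 2 <= len(data):
        tag = data[pos]
        length = data[pos + 1]
        payload = data[pos + 2 : pos + 2 + length]
        if len(payload) < length:
            break  # truncated descriptor: stop rather than misparse
        if tag in KNOWN_DESCRIPTORS:
            found.append((KNOWN_DESCRIPTORS[tag], payload))
        pos += 2 + length  # advance past this descriptor, known or not
    return found


loop = bytes([0x81, 1, 0xAA, 0xFE, 2, 1, 2, 0xA0, 0])
print(parse_descriptor_loop(loop))
```

In the sample loop, the descriptor with tag 0xFE is not in the mapping, so it is stepped over and the two descriptors on either side are still returned.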



AC-3 Audio Stream Descriptor 

This OpenCable descriptor indicates 
Dolby® Digital audio is present, and is dis- 
cussed in Chapter 13. ATSC also uses this 
descriptor. 

ATSC Private Information Descriptor 

This OpenCable descriptor provides a 
way to carry private information, and is 
discussed in Chapter 13. More than one such 
descriptor may appear within a single 
descriptor loop. ATSC also uses this descriptor. 
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Component Name Descriptors 

These two OpenCable descriptors define 
a variable-length text-based name for any 
component of the service, and are discussed in 
Chapter 13. ATSC also uses one of these 
descriptors. 

Content Advisory Descriptor 

This OpenCable descriptor defines the 
ratings for a given program, and is discussed in 
Chapter 13. ATSC also uses this descriptor. 

DCC Arriving Request Descriptor 

This OpenCable descriptor provides 
instructions for the actions to be performed by 
a receiver upon arriving at a newly changed 
channel: 

Display text for at least 10 seconds, or for 
less time if the viewer issues a “continue,” 
“OK,” or equivalent command. 

Display text indefinitely, or until the viewer 
issues a “continue,” “OK,” or equivalent 
command. 

ATSC also uses this descriptor. 

DCC Departing Request Descriptor 

This OpenCable descriptor provides 
instructions for the actions to be performed by 
a receiver prior to leaving a channel: 

Cancel any outstanding things and immedi- 
ately perform the channel change. 

Display text for at least 10 seconds, or for 
less time if the viewer issues a “continue,” 
“OK,” or equivalent command. 

Display text indefinitely, or until the viewer 
issues a “continue,” “OK,” or equivalent com- 
mand. 

ATSC also uses this descriptor. 



Extended Channel Name Descriptor 

This OpenCable descriptor provides a 
variable-length channel name for the virtual 
channel. ATSC also uses this descriptor. 

Extended Video Descriptor 

This OpenCable descriptor identifies 
some attributes that may be needed for pro- 
cessing, and is discussed in Chapter 13. 

Frame Rate Descriptor 

This OpenCable descriptor identifies the 
video frame rate, and is discussed in Chapter 
13. 

MAC Address List Descriptor 

This OpenCable descriptor is used when 
implementing IP (Internet Protocol) multicast- 
ing over MPEG-2 transport streams, and is dis- 
cussed in Chapter 13. 

Redistribution Control Descriptor 

This OpenCable descriptor conveys any 
redistribution control information held by the 
content rights holder, and is discussed in 
Chapter 13. ATSC also uses this descriptor. 

Service Location Descriptor 

This OpenCable descriptor specifies the 
stream type, PID, and language code for each 
elementary stream. It is present in the CVCT 
for each active channel. ATSC also uses this 
descriptor. 

Time-Shifted Service Descriptor 

This OpenCable descriptor links one vir- 
tual channel with up to 20 other virtual chan- 
nels carrying the same programming, but 
time-shifted. A typical application is for Near 
Video On Demand (NVOD) services. ATSC 
also uses this descriptor. 
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Out-of-Band System 
Information (SI) 

SI data may also be conveyed out-of-band 
(OOB). The CableCARD™ converts the OOB 
SI data, which may or may not be within a com- 
pliant MPEG-2 transport stream, into compli- 
ant table sections, each associated with an 
appropriate PID value. 

Based on the network configuration, OOB 
messaging is implemented either over the 
OOB FDC and OOB RDC channels or over the 
DOCSIS® channel. The CableCARD tells the 
receiver which of the two systems to use. 

Tables 

Six profiles are defined (Table 16.4) that 
indicate required and optional tables. Adher- 
ence to these profiles is required for compli- 
ance. 

Aggregate Event Information Table (AEIT) 

This table delivers event title and schedule 
information for supporting an EPG. To reduce 
the total number of PID values used for SI 
data, the format allows instances of table sec- 
tions for different time periods to be associated 
with common PID values. 

Aggregate Extended Text Table (AETT) 

This table contains Extended Text Mes- 
sages (ETM), which may be used to convey 
detailed event descriptions. An ETM is a multi- 
ple string data structure, and is therefore capa- 
ble of conveying a description in several 
different languages. 



Emergency Alert (EA) Table 

This table provides a signaling method 
that enables cable TV operators to send emer- 
gency messages to digital set-top boxes, digital 
television receivers, digital VCRs, and so on. 
These devices must be able to store any EA 
events for later use; the start time and duration 
information are used to delete expired events. 

Long-Form Virtual Channel Table (L-VCT) 

This table is the CVCT transmitted using 
MPEG-2 private sections. 

Master Guide Table (MGT) 

This table provides general information 
about the other tables. It defines table sizes, 
version numbers, and packet identifiers (PIDs). 
ATSC also uses this table. 

Network Information Table (NIT) 

This table groups a number of transport 
streams together, providing tuning information 
to the receiver. 

Network Text Table (NTT) 

This table delivers system-wide multilin- 
gual text strings. 

Rating Region Table (RRT) 

This table transmits the rating system, 
commonly referred to as the “V-chip.” ATSC 
also uses this table. 

Short-Form Virtual Channel Table (S- 
VCT) 

This table delivers portions of the Virtual 
Channel Map (VCM), Defined Channels Map 
(DCM), and the Inverse Channel Map (ICM). 
                                         Profile 1  Profile 2  Profile 3  Profile 4  Profile 5    Profile 6 
Table                         Table ID   Baseline   Revision   Parental   Standard   Combination  SI Only 
                                                    Detection  Advisory   EPG Data 
NIT                           0xC2 
  carrier definition subtable            M          M          M          M          M 
  modulation mode subtable               M          M          M          M          M 
NTT                           0xC3 
  source name subtable                   O          O          O          M          M 
Short-form VCT                0xC4 
  virtual channel map                    M          M          M          M          M 
  defined channels map                   M          M          M          M          M 
  inverse channel map                    O          O          O          O          O 
STT                           0xC5       M          M          M          M          M            M 
MGT                           0xC7                             M          M          M            M 
RRT                           0xCA                             M          M          M            M 
L-VCT                         0xC9                                                   M            M 
AEIT                          0xD6                                        M          M            M 
AETT                          0xD7                                        O          O            O 

Note: 

1. M = required in this profile. O = may be present in this profile. 

Table 16.4. Usage of Tables in Various Profiles. 
                                  Profile 1  Profile 2  Profile 3  Profile 4  Profile 5    Profile 6 
Descriptor                Tag     Baseline   Revision   Parental   Standard   Combination  SI Only 
                                             Detection  Advisory   EPG Data 
AC-3 audio                0x81                                     O          O            O 
caption service           0x86                                     O          O            O 
channel properties        0x95                                     O          O 
component name            0xA3                                     O          O            O 
content advisory          0x87                          M          M          M            M 
daylight savings time     0x96                          O          M          M            M 
extended channel name     0xA0                                                O            O 
revision detection        0x93               M          M          M          M 
time shifted service      0xA2                                                O            O 
two part channel number   0x94                                     O          O 

Note: 

1. M = required in this profile. O = may be present in this profile. 

Table 16.5. Usage of Descriptors in Various Profiles. 



System Time Table (STT) 

This table serves as a reference for the 
time of day. Receivers use it to maintain the 
correct local time. ATSC also uses this table. 

Descriptors 

Tables 16.5 and 16.6 illustrate the usage of 
descriptors in the profiles and tables, respec- 
tively. 

AC-3 Audio Stream Descriptor 

This descriptor indicates Dolby® Digital 
audio is present, and is discussed in Chapter 
13. ATSC also uses this descriptor. 



Channel Properties Descriptor 

This descriptor enables receivers to 
become aware of various channel aspects. Oth- 
erwise, the receiver must tune the channel and 
self-discover the channel’s aspects. 

Component Name Descriptor 

These two OpenCable descriptors define 
a variable-length text-based name for any com- 
ponent of the service, and are discussed in 
Chapter 13. ATSC also uses one of these 
descriptors. 

Content Advisory Descriptor 

This descriptor defines the ratings for a 
given program, and is discussed in Chapter 13. 
ATSC also uses this descriptor. 
Descriptor                Tag     Table 
                                  PMT      NIT      NTT      S-VCT    STT      MGT      L-VCT    RRT      AEIT 
PID                               per PAT  0x1FFC   0x1FFC   0x1FFC   0x1FFC   0x1FFC   0x1FFC   0x1FFC   per MGT 
Table_ID                          0x02     0xC2     0xC3     0xC4     0xC5     0xC7     0xC9     0xCA     0xD6 

AC-3 audio                0x81    X                                                                       X 
caption service           0x86    X                                                                       X 
channel properties        0x95                               X 
component name            0xA3    X 
content advisory          0x87    X                                                                       X 
daylight savings time     0x96                                        X 
extended channel name     0xA0                                                          X 
revision detection        0x93             X        X        X 
time-shifted service      0xA2                                                          X 
two-part channel number   0x94                               X 

Table 16.6. Usage of Descriptors in Various Tables. 



Daylight Savings Time Descriptor 

This descriptor indicates whether or not 
daylight savings time is currently being 
observed and the time/day on which the 
daylight savings time transition occurs. 

The receiver must not assume that the lack 
of this descriptor means that daylight savings 
time is not currently in effect. 

Extended Channel Name Descriptor 

This descriptor provides a variable-length 
channel name for the virtual channel. ATSC 
also uses this descriptor. 

Revision Detection Descriptor 

This descriptor indicates if new informa- 
tion is contained in the table section in which it 
appears. It should be the first descriptor in the 
list to minimize processing overhead. 



Time-Shifted Service Descriptor 

This descriptor links one virtual channel 
with up to 20 other virtual channels carrying 
the same programming, but time-shifted. A 
typical application is for Near Video On 
Demand (NVOD) services. ATSC also uses 
this descriptor. 

Two-Part Channel Number Descriptor 

This descriptor may be used to associate a 
two-part channel number (e.g., 10-2) with any 
virtual channel. 
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In-Band Data Broadcasting 

The OpenCable™ in-band data broadcast 
standard describes various ways to transport 
data within the MPEG-2 transport stream. It 
can be used for a number of applications, such 
as: 

Delivering declarative data (HTML code) 

Delivering procedural data (Java code) 

Delivering software and images 

Delivering MPEG-4.2, MPEG-4.10 (H.264), or 
SMPTE 421M (VC-1) video streams 

Carouseling MPEG-2 video or MP3 audio files 

The key elements defined within the stan- 
dard are: 

Data services announcement 

Data delivery models such as data piping, data 
streaming, addressable sections, and data down- 
load 

Application signaling 
MPEG-2 systems tools 
Protocols 

A data service must be contained in a vir- 
tual channel, and each virtual channel may 
have at most one data service. One data ser- 
vice may consist of multiple applications, and 
each application may contain multiple data ele- 
ments. 
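
The containment rules in the paragraph above (one data service at most per virtual channel, several applications per service, several data elements per application) can be modeled as a small data structure. This is a sketch for illustration only; the class and field names are hypothetical and do not come from the standard.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Application:
    name: str
    data_elements: List[str] = field(default_factory=list)  # any number


@dataclass
class DataService:
    applications: List[Application] = field(default_factory=list)  # any number


@dataclass
class VirtualChannel:
    number: str
    data_service: Optional[DataService] = None  # at most one per channel

    def attach(self, service: DataService) -> None:
        """Attach a data service, enforcing the one-per-channel rule."""
        if self.data_service is not None:
            raise ValueError("virtual channel already carries a data service")
        self.data_service = service


channel = VirtualChannel("10-2")
channel.attach(DataService([Application("guide", ["module-a", "module-b"])]))
```

Attempting a second `attach` on the same channel raises `ValueError`, which mirrors the "at most one data service" constraint.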

For in-band data broadcasting, OpenCable™ 
largely follows the ATSC data broadcast 
standard. The major difference is the 
addition of out-of-band announcements. 

Data broadcasting is also discussed in 
Chapter 13. 



Data Service Announcements 

Data broadcasting utilizes and extends SI 
to announce and find data services in the 
broadcast stream. Data services are 
announced by an event in either the EIT or 
Data Event Table (DET) . 

Additional tables for data service 
announcements include (additional descriptors 
for data services are not discussed) : 

Data Event Table (DET) 

There are up to 128 DETs, DET-0 through 
DET-127, each of which describes information 
(titles, start times, etc.) for data services on vir- 
tual channels. Each DET is valid for 3 hours. A 
minimum of four DETs (DET-0 through DET- 
3) are required. Any change in a DET shall 
trigger a change in version of the MGT. ATSC 
also supports this table. 

Extended Text Table (ETT) 

The ETT is used to provide detailed 
descriptions of a data event. The syntax is 
mostly identical to the ETT used for audio and 
video services. ATSC also supports this table. 

Long-Term Service Table (LTST) 

The LTST is used to pre-announce data 
events that will occur on a time scale outside 
what the EITs/DETs can support. ATSC also 
supports this table. 
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Service Description Framework (SDF) 

Due to the wide range of protocol encapsu- 
lations possible, there is a need to signal which 
encapsulation is used in each data stream. 
While it is possible to use the PMT for signal- 
ing, this approach is not very scalable, working 
only for simple cases. The Service Description 
Framework (SDF) was developed to provide a 
scalable framework. 

Additional tables for the service descrip- 
tion include (additional descriptors for data 
services are not discussed) : 

Aggregate Data Event Table (ADET) 

There are up to 128 ADETs, ADET-0 
through ADET-127. They deliver event title 
and schedule information for implementing an 
Electronic Program Guide (EPG). The pur- 
pose of the ADET is: 

To announce a data service in a virtual 
channel which does not include any audio- 
visual event. 

To allow separate announcement of the 
data service portion of an audio/video/data 
service or audio/data service in a virtual chan- 
nel. 

The transmission format allows ADET 
table sections to share a common PID value 
since the CableCARD can support only a 
small number of concurrent data flows (each 
associated with one PID value). Each ADET is 
valid for 3 hours. A minimum of 4 ADETs 
(ADET-0 through ADET-3) are required. Any 
change in an ADET triggers a change in ver- 
sion of the MGT. 

ADET section tables may be delivered to 
the receiver either in-band (via MPEG-2 trans- 
port streams) or out-of-band. 



Data Service Table (DST) 

The DST provides the description of a data 
service comprised of one or more receiver 
applications. It also provides information to 
allow data receivers to associate applications 
with references to data consumed by them. 
ATSC also supports this table. 

Network Resources Table (NRT) 

The NRT provides a list of all network 
resources outside those in the current MPEG- 
2 program or transport stream. 

A data service may use the NRT to get 
data packets or datagrams other than the ones 
published in the Service Location Descriptor of 
the CVCT. This includes data elementary 
streams in another program within the same 
transport stream, data elementary streams in 
other transport streams, and bi-directional 
communication channels using other protocols 
such as IP. ATSC also supports this table. 



Conditional Access 

The conditional access (CA) scheme used 
by OpenCable is similar to Multicrypt and 
the DVB Common Interface discussed in 
Chapter 17. OpenCable calls its CA module 
CableCARD, and it is also based on the 
EIA-679 NRSS-B interface (PCMCIA or PC 
Card form factor). Two major additions over 
DVB’s solution are the ability to support up to 
6 simultaneous streams (requires the 
multi-stream CableCARD) and a 
DFAST-encrypted interface between the 
CableCARD output and the MPEG-2 decoder 
input. 

A downloadable conditional access system 
(DCAS) will shortly be used, eliminating 
the need for using a CableCARD. 







Related Technologies 

In addition to OpenCable™, CableLabs® 
has developed, and is continually developing, a 
wide variety of cable-related standards. 

DOCSIS® 

DOCSIS® (Data Over Cable Service Inter- 
face Specification) defines interface require- 
ments for cable modems. 

PacketCable 

PacketCable defines a common platform 
to deliver real-time multimedia services. Built 
on top of DOCSIS®, PacketCable uses 
Internet Protocol (IP) technology. 



Application Block Diagrams 

Figure 16.2 illustrates an OpenCable™ set- 
top box. Part of the requirements is the ability 
to output both high-definition and standard- 
definition versions of HD content simulta- 
neously. 

Figure 16.3 illustrates the incorporation of 
a one-way OpenCable receiver into a digital 
television. 
Figure 16.2. OpenCable Receiver Set-Top Box Block Diagram. 
(Diagram not reproduced. Labeled input: cable.) 
Figure 16.3. OpenCable Receiver Inside a Digital Television 
(Diagram not reproduced. Labeled inputs: cable, ATSC, 
audio in, NTSC, S-Video, YPbPr, HDMI. Labeled outputs: 
to display; to power amplifiers and speakers.) 
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DVB 

Digital Television 



The DVB (Digital Video Broadcast) digital 
television (DTV) broadcast standard is used in 
most regions except the United States, Can- 
ada, South Korea, Taiwan, Brazil, and Argen- 
tina. 

The three other primary DTV standards 
are ISDB (Integrated Services Digital 
Broadcasting), ATSC (Advanced Television 
Systems Committee), and OpenCable. The 
basic audio and video capabilities are very 
similar. The major differences are the RF 
modulation schemes and the level of definition 
for non-audio/video services. A comparison of 
the DVB standards is shown in Table 17.1. 

The DVB standard is actually a group of 
ETSI standards: 

EN 300 421    DVB-S: Framing Structure, Channel Coding and
              Modulation for 11/12 GHz Satellite Services

EN 300 429    DVB-C: Framing Structure, Channel Coding and
              Modulation for Cable Systems

EN 300 468    Specification for Service Information (SI)
              in DVB Systems

EN 300 472    Specification for Conveying ITU-R System B
              Teletext in DVB Bitstreams

EN 300 743    Subtitling Systems

EN 300 744    DVB-T: Framing Structure, Channel Coding and
              Modulation for Digital Terrestrial Television

EN 301 192    DVB Specification for Data Broadcasting

EN 301 775    Specification for the Carriage of Vertical
              Blanking Information (VBI) Data in DVB Bitstreams

EN 302 304    DVB-H: Transmission System for Handheld
              Terminals

EN 302 307    DVB-S2: Second Generation Framing Structure,
              Channel Coding and Modulation Systems for
              Broadcasting, Interactive Services, News
              Gathering and Other Broadband Satellite
              Applications

ES 200 800    DVB Interaction Channel for Cable TV
              Distribution Systems (CATV)

ETS 300 801   Interaction Channel through Public Switched
              Telecommunications Network (PSTN)/Integrated
              Services Digital Networks (ISDN)

ETS 300 802   Network-independent Protocols for DVB
              Interactive Services

TR 101 190    Implementation Guidelines for DVB Terrestrial
              Services; Transmission Aspects

TR 101 194    Guidelines for Implementation and Usage of the
              Specification of Network Independent Protocols
              for DVB Interactive Services

TR 101 200    A Guideline for the Use of DVB Specifications
              and Standards

TR 101 202    Implementation Guidelines for Data Broadcasting

TR 101 211    Guidelines on Implementation and Usage of
              Service Information (SI)

TS 101 154    Implementation Guidelines for the Use of Video
              and Audio Coding in Broadcasting Applications
              Based on the MPEG-2 Transport Stream

TS 101 699    Extensions to the Common Interface Specification

TS 102 470    IP Datacast over DVB-H: Program Specific
              Information (PSI)/Service Information (SI)

TS 102 472    IP Datacast over DVB-H: Content Delivery
              Protocols

EN 50221      Common Interface Specification for Conditional
              Access and Other Digital Video Broadcasting
              Decoder Applications

ETR 289       Support for Use of Scrambling and Conditional
              Access (CA) within Digital Broadcasting Systems



Parameter           DVB-T            DVB-C            DVB-S/-S2      DVB-H              DVB-SH
                    (Terrestrial)    (Cable)          (Satellite)    (Handheld)         (Handheld)

video compression   MPEG-2, MPEG-4.10 (H.264), SMPTE 421M (VC-1)     MPEG-4.10 (H.264), SMPTE 421M (VC-1)

audio compression   MPEG, Dolby® Digital, Dolby® Digital Plus,       MPEG-4 AAC, MPEG-4 HE-AAC v1/v2,
                    DTS®, MPEG-4 AAC, MPEG-4 HE-AAC v1/v2            AMR-WB+

multiplexing        MPEG-2 transport stream                          RTP-encapsulated MPEG-2 transport stream

modulation          COFDM            QAM              QPSK           uses DVB-T         uses DVB-S

channel bandwidth   6, 7, or 8 MHz   6, 7, or 8 MHz   -              6, 7, or 8 MHz     -

Table 17.1. Comparison of DVB Standards.

DVB uses an MPEG-2 transport stream to
convey compressed digital video, compressed
digital audio, and data over a 6, 7, or 8 MHz
channel. Multiple video streams, multiple
audio streams, and/or data may be present in
the MPEG-2 transport stream.

The MPEG-2 transport stream has a maximum
bit-rate of ~24.1 Mbps (8 MHz DVB-T) or
~51 Mbps (8 MHz 256-QAM DVB-C). DVB-S
bit-rates are dependent on the transponder
bandwidth and code rates used, and can
approach 54 Mbps (DVB-S2 offers a 25-35%
bit-rate capacity gain over DVB-S). The bit-rate
can be used in a very flexible manner, trading
off the number of programs offered versus
video quality and resolution.
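
The trade-off between the number of programs and the bit-rate given to each one can be illustrated with simple arithmetic. The following is a minimal sketch that assumes every program receives the same constant bit-rate and that ignores PSI/SI overhead and statistical multiplexing. The function name and the per-program rates are hypothetical and were chosen only for illustration.

```python
def programs_per_multiplex(mux_mbps, program_mbps):
    """Return how many equal-rate programs fit in one transport stream."""
    if program_mbps <= 0:
        raise ValueError("program bit-rate must be positive")
    # A small tolerance avoids losing a program to floating-point rounding.
    return int(mux_mbps / program_mbps + 1e-9)


DVB_T_8MHZ_MBPS = 24.1  # approximate figure quoted in the text
DVB_C_8MHZ_MBPS = 51.0  # approximate figure quoted in the text (256-QAM)

SD_PROGRAM_MBPS = 4.0   # assumed rate for one standard-definition program
HD_PROGRAM_MBPS = 12.0  # assumed rate for one high-definition program

for name, capacity in (("DVB-T", DVB_T_8MHZ_MBPS), ("DVB-C", DVB_C_8MHZ_MBPS)):
    sd = programs_per_multiplex(capacity, SD_PROGRAM_MBPS)
    hd = programs_per_multiplex(capacity, HD_PROGRAM_MBPS)
    print(f"{name}: {sd} SD programs or {hd} HD programs")
```

Under these assumed rates, an 8 MHz DVB-T channel carries six standard-definition programs or two high-definition programs; lowering the per-program rate raises the count at the cost of picture quality.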

DVB-H and DVB-SH for mobile applica- 
tions use IP datacasting within DVB-T and 
DVB-S, respectively. RTP-encapsulated MPEG- 
2 transport streams are used. 

Second generation versions of DVB-T and 
DVB-C (called DVB-T2 and DVB-C2, respec- 
tively) are being investigated. 



Video Capability 

Although any resolution may be used as 
long as the maximum bit-rate is not exceeded, 
there are several standardized resolutions, 
indicated in Table 17.2. Both interlaced and 
progressive pictures are permitted for most of 
the resolutions. 

DVB-T, DVB-C, DVB-S, and DVB-S2 sup- 
port MPEG-2 (MP@ML, MP@HL), MPEG- 
4.10 (MP@L3, HP@L4), and SMPTE 421M 
(AP@L1, AP@L3) video. 

DVB-IP (“DVB over IP”, used by DVB-H,
DVB-SH, and DVB-IPTV) adds support
for MPEG-4.10 (BP@L1b, BP@L1.2,
BP@L2) and SMPTE 421M (SP@LL, SP@ML,
AP@L0) video.



Audio Capability 

DVB-T, DVB-C, DVB-S, and DVB-S2 support
MPEG-1 Layer II, MPEG-2 BC multi-channel
Layer II, Dolby® Digital, Dolby® Digital
Plus, DTS®, MPEG-4 AAC, and MPEG-4
HE-AAC v1/v2 audio.

DVB-IP (“DVB over IP”, used by DVB-H,
DVB-SH, and DVB-IPTV) adds support
for AMR-WB+ audio.



Service Information (SI)

ETSI EN 300 468 specifies the Service 
Information (SI) data which forms a part of 
DVB bitstreams. SI is a small collection of hier- 
archically associated tables (see Figure 17.1 
and Table 17.3) designed to extend the MPEG- 
2 PSI tables. It provides information on what is 
available on other transport streams and even 
other networks. The method of information 
presentation to the user is not specified, allow- 
ing receiver manufacturers to choose appropri- 
ate presentation methods. 

Required Tables 

Event Information Table (EIT)

There are up to 128 EITs, EIT-0 through 
EIT-127, each of which describes the events or 
TV programs associated with each channel. 
Each EIT is valid for 3 hours. Since there are 
up to 128 EITs, up to 16 days of programming 
may be advertised in advance. The first four 
EITs are required (the first 24 are recom- 
mended) to be present. 
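
The coverage figures above follow directly from the 3-hour span of each EIT. A short sketch of the arithmetic, using only the numbers given in the text (the function name is hypothetical):

```python
EIT_SPAN_HOURS = 3  # each EIT is valid for 3 hours (from the text)


def eit_coverage_hours(eit_count):
    """Return the hours of programming described by a number of EITs."""
    return eit_count * EIT_SPAN_HOURS


for label, count in (("required", 4), ("recommended", 24), ("maximum", 128)):
    hours = eit_coverage_hours(count)
    print(f"{label}: {count} EITs = {hours} hours = {hours / 24:g} days")
```

The four required EITs therefore cover 12 hours, the 24 recommended EITs cover 3 days, and all 128 EITs cover 16 days.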

Information provided by the EIT includes
start time, duration, title, pointer to optional
descriptive text for the event, advisory data,
caption service data, audio service descriptor,
etc.

Active        SDTV or  Frame Rate (p = progressive, i = interlaced)
Resolution    HDTV     23.976p  25i  29.97i  25p  29.97p  50p  59.94p
(Y)                    24p           30i          30p          60p

480 x 480     SDTV     X        -    X       -    X       -    X
480 x 576     SDTV     X        X    -       X    -       X    -
544 x 480     SDTV     X        -    X       -    X       -    X
544 x 576     SDTV     X        X    -       X    -       X    -
704 x 480     SDTV     X        -    X       -    X       -    X
704 x 576     SDTV     X        X    -       X    -       X    -
1280 x 720    HDTV     X        -    -       X    X       X    X
960 x 1080    HDTV     X        X    X       X    X       -    -
1280 x 1080   HDTV     X        X    X       X    X       -    -
1440 x 1080   HDTV     X        X    X       X    X       -    -
1920 x 1080   HDTV     X        X    X       X    X       -    -

Table 17.2. Common Active Resolutions for DVB Digital Television.



Network Information Table (NIT)

The NIT provides information about the 
physical network, including any grouping of 
transport streams and the relevant tuning 
information. It can be used during receiver set- 
up and the relevant tuning information stored 
in non-volatile memory. The NIT can also be 
used to signal changes of tuning information. 

Service Description Table (SDT) 

The SDT describes the available services, 
such as the service names, the service provid- 
ers, etc. 

Time and Date Table (TDT) 

The TDT contains the current UTC time and
date, with the date coded as a Modified Julian
Date (MJD). Receivers can use it to maintain
the correct local time.
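
As an illustration of the MJD coding, the sketch below decodes a 5-byte UTC time field. The layout assumed here (a 16-bit MJD followed by hours, minutes, and seconds as six BCD digits) reflects the UTC_time field of ETSI EN 300 468 as commonly described; it is not spelled out in this chapter, so it should be checked against the specification. The function names are hypothetical.

```python
from datetime import datetime, timedelta

MJD_EPOCH = datetime(1858, 11, 17)  # day 0 of the Modified Julian Date


def bcd_to_int(byte):
    """Convert one packed-BCD byte (two decimal digits) to an integer."""
    return (byte >> 4) * 10 + (byte & 0x0F)


def decode_utc_time(field):
    """Decode 5 bytes: a 16-bit MJD, then HH, MM, SS in packed BCD."""
    if len(field) != 5:
        raise ValueError("expected exactly 5 bytes")
    mjd = (field[0] << 8) | field[1]
    hours, minutes, seconds = (bcd_to_int(b) for b in field[2:5])
    return MJD_EPOCH + timedelta(days=mjd, hours=hours,
                                 minutes=minutes, seconds=seconds)


# MJD 0xC079 (49273) with time 12:45:00
print(decode_utc_time(bytes.fromhex("C079124500")))  # 1993-10-13 12:45:00
```

Counting days from the MJD epoch with the standard library avoids the explicit year/month formulas; a receiver without a date library would use the conversion formulas given in the specification instead.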



Optional Tables 

Bouquet Association Table (BAT) 

The BAT provides information regarding 
bouquets (groups of services that may traverse 
the network boundary). Along with the name 
of the bouquet, it provides a list of services for 
each bouquet. 

Discontinuity Information Table (DIT) 

The DIT is present at transition points
where the SI information is discontinuous. The
use of this table is restricted to partial
transport streams; it is not used in broadcasts.

IP/MAC Notification Table (INT) 

The INT is used to signal the availability
and location of IP streams in DVB networks.
There may be more than one INT.

[Figure 17.1 groups the tables as follows; the PID of each table is
shown in parentheses.]

MPEG-2 defined tables:
    PAT (PID = 0x0000)
    CAT (PID = 0x0001)
    PMT
    TSDT (PID = 0x0002)

DVB mandatory tables:
    NIT, actual network (PID = 0x0010)
    SDT, actual transport stream (PID = 0x0011)
    EIT, actual transport stream, present/following (PID = 0x0012)
    TDT, time and date (PID = 0x0014)

DVB optional tables:
    NIT, other network (PID = 0x0010)
    BAT, bouquet association (PID = 0x0011)
    SDT, other transport stream (PID = 0x0011)
    EIT, actual transport stream, schedule (PID = 0x0012)
    EIT, other transport stream, present/following (PID = 0x0012)
    EIT, other transport stream, schedule (PID = 0x0012)
    RST, running status (PID = 0x0013)
    TOT, time offset (PID = 0x0014)

Figure 17.1. DVB SI Table Relationships.

Table   PID       table_id      Repetition rate
PMT     per PAT   0x02          100 ms
NIT     0x0010    0x40, 0x41    10 sec
BAT     0x0011    0x4A          10 sec
SDT     0x0011    0x42, 0x46    2-10 sec
EIT     0x0012    0x4E-0x6F     2-10 sec
TOT     0x0014    0x73          30 sec
SIT     0x001F    0x7F          30 sec

Descriptor                     Descriptor  Tables
                               Tag         PMT  NIT  BAT  SDT  EIT  TOT  SIT

AAC                            0111 1100   X    -    -    -    -    -    -
AC-3                           0110 1010   X    -    -    -    -    -    -
adaptation field data          0111 0000   X    -    -    -    -    -    -
application signaling          0110 1111   X    -    -    -    -    -    -
ancillary data                 0110 1011   X    -    -    -    -    -    -
announcement support           0110 1110   -    -    -    X    -    -    -
bouquet name                   0100 0111   -    -    X    X    -    -    X
cable delivery system          0100 0100   -    X    -    -    -    -    -
CA identifier                  0101 0011   -    -    X    X    X    -    X
cell frequency link            0110 1101   -    X    -    -    -    -    -
cell list                      0110 1100   -    X    -    -    -    -    -
component                      0101 0000   -    -    -    X    X    -    X
content                        0101 0100   -    -    -    -    X    -    X
country availability           0100 1001   -    -    X    X    -    -    X
data broadcast                 0110 0100   -    -    -    X    X    -    X
data broadcast ID              0110 0110   X    -    -    -    -    -    -
DSNG                           0110 1000   -    -    -    -    -    -    -
DTS audio                      0111 1011   X    -    -    -    -    -    -

Notes:

1. PMT: MPEG-2 Program Map Table.

2. SIT only present in partial transport streams.

Table 17.3a. List of DVB SI Tables, Descriptors, and Descriptor Locations.

Descriptor                     Descriptor  Tables
                               Tag         PMT  NIT  BAT  SDT  EIT  TOT  SIT

enhanced AC-3                  0111 1010   X    -    -    -    -    -    -
extended event                 0100 1110   -    -    -    -    X    -    X
extension                      0111 1111   X    X    X    X    X    X    X
frequency list                 0110 0010   -    X    -    -    -    -    -
linkage                        0100 1010   -    X    X    X    X    -    X
local time offset              0101 1000   -    -    -    -    -    X    -
mosaic                         0101 0001   X    -    -    X    -    -    X
multilingual bouquet name      0101 1100   -    -    X    -    -    -    -
multilingual component         0101 1110   -    -    -    -    X    -    X
multilingual network name      0101 1011   -    X    -    -    -    -    -
multilingual service name      0101 1101   -    -    -    X    -    -    X
network name                   0100 0000   -    X    -    -    -    -    -
NVOD reference                 0100 1011   -    -    -    X    -    -    X
parental rating                0101 0101   -    -    -    -    X    -    X
partial transport stream       0110 0011   -    -    -    -    -    -    X
PDC                            0110 1001   -    -    -    -    X    -    -
private data specifier         0101 1111   X    X    X    X    X    -    X

Notes:

1. PMT: MPEG-2 Program Map Table.

2. SIT only present in partial transport streams.

Table 17.3b. List of DVB SI Tables, Descriptors, and Descriptor Locations.

Descriptor                     Descriptor  Tables
                               Tag         PMT  NIT  BAT  SDT  EIT  TOT  SIT

satellite delivery system      0100 0011   -    X    -    -    -    -    -
S2 satellite delivery system   0111 1001   -    X    -    -    -    -    -
scrambling                     0110 0101   X    -    -    -    -    -    -
service                        0100 1000   -    -    -    X    -    -    X
service availability           0111 0010   -    -    -    X    -    -    -
service identifier             0111 0001   -    -    -    X    -    -    -
service list                   0100 0001   -    X    X    -    -    -    -
service move                   0110 0000   X    -    -    -    -    -    -
short event                    0100 1101   -    -    -    -    X    -    X
short smoothing buffer         0110 0001   -    -    -    -    X    -    X
stream identifier              0101 0010   X    -    -    -    -    -    -
stuffing                       0100 0010   -    X    X    X    X    -    X
subtitling                     0101 1001   X    -    -    -    -    -    -
telephone                      0101 0111   -    -    -    X    X    -    X
teletext                       0101 0110   X    -    -    -    -    -    -
terrestrial delivery system    0101 1010   -    X    -    -    -    -    -
time-shifted event             0100 1111   -    -    -    -    X    -    X
time-shifted service           0100 1100   -    -    -    X    -    -    X
transport stream               0110 0111   -    -    -    -    -    -    -
VBI data                       0100 0101   X    -    -    -    -    -    -
VBI teletext                   0100 0110   X    -    -    -    -    -    -

Notes:

1. PMT: MPEG-2 Program Map Table.

2. SIT only present in partial transport streams.

Table 17.3c. List of DVB SI Tables, Descriptors, and Descriptor Locations.



Running Status Table (RST)

The RST updates the running status of one
or more events. It is sent only once, at the
time of an event status change, unlike other
tables, which are usually transmitted
repeatedly.

Selection Information Table (SIT)

The SIT describes services and events
carried by a partial transport stream. The use of
this table is restricted to partial transport
streams; it is not used in broadcasts.

Stuffing Table (ST) 

The ST is used to replace or invalidate sub- 
tables or complete SI tables. 

Time Offset Table (TOT) 

The TOT is the same as the TDT, except 
that it includes local time offset information. 
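
Because several SI tables share a PID, a receiver needs both the PID and the table_id to tell them apart. The sketch below transcribes the PID and table_id values of Table 17.3a into a lookup; the structure and function name are hypothetical, and tables that Table 17.3a does not list (such as the TDT and RST) are deliberately left out rather than guessed.

```python
# (PID, first table_id, last table_id, table name), from Table 17.3a.
SI_TABLES = [
    (0x0010, 0x40, 0x41, "NIT"),
    (0x0011, 0x4A, 0x4A, "BAT"),
    (0x0011, 0x42, 0x42, "SDT"),
    (0x0011, 0x46, 0x46, "SDT"),
    (0x0012, 0x4E, 0x6F, "EIT"),
    (0x0014, 0x73, 0x73, "TOT"),
    (0x001F, 0x7F, 0x7F, "SIT"),
]


def identify_si_table(pid, table_id):
    """Return the SI table name for a section, or None if not listed."""
    for table_pid, first, last, name in SI_TABLES:
        if pid == table_pid and first <= table_id <= last:
            return name
    return None


print(identify_si_table(0x0011, 0x4A))  # BAT
print(identify_si_table(0x0011, 0x42))  # SDT
print(identify_si_table(0x0012, 0x50))  # EIT
```

The SDT and BAT both arrive on PID 0x0011, so the PID alone is not enough to route a section to the correct parser.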

Descriptors 

Much like MPEG-2, DVB uses descriptors 
to add new functionality. In addition to various 
MPEG-2 descriptors, one or more of these 
DVB-specific descriptors may be included 
within the PMT or one or more SI tables (see 
Table 17.3) to extend data within the tables. A 
descriptor not recognized by a decoder must 
be ignored by that decoder. This enables new 
descriptors to be implemented without affect- 
ing receivers that cannot recognize and pro- 
cess the descriptors. 
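
The rule that unrecognized descriptors are ignored falls out naturally from the descriptor layout. The sketch below assumes the usual MPEG-2 descriptor structure (an 8-bit tag, an 8-bit length, then that many payload bytes); the tag values are the binary tags of Table 17.3 converted to hexadecimal, while the function name and the sample bytes are hypothetical.

```python
# Descriptor tags from Table 17.3 (binary tags written in hex).
KNOWN_TAGS = {
    0x40: "network name",
    0x48: "service",
    0x4D: "short event",
}


def parse_descriptors(data):
    """Yield (name, payload) for each recognized descriptor in a loop."""
    pos = 0
    while pos + 2 <= len(data):
        tag, length = data[pos], data[pos + 1]
        payload = data[pos + 2:pos + 2 + length]
        if len(payload) < length:
            raise ValueError("descriptor runs past the end of the loop")
        pos += 2 + length  # the length byte lets us step over any descriptor
        name = KNOWN_TAGS.get(tag)
        if name is None:
            continue  # unrecognized descriptor: ignore it
        yield name, payload


# A network name descriptor followed by a descriptor this parser
# does not know (tag 0xEE is an arbitrary example value).
loop = bytes([0x40, 0x06]) + b"Munich" + bytes([0xEE, 0x02, 0x12, 0x34])
print(list(parse_descriptors(loop)))  # [('network name', b'Munich')]
```

Since every descriptor carries its own length, a receiver can skip a descriptor without understanding its contents, which is what allows new descriptors to be deployed without breaking existing receivers.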

AAC Descriptor 

This DVB descriptor indicates MPEG-4 
AAC, HE-AAC, or HE-AAC v2 audio is present, 
and is discussed in Chapter 13. 



AC-3 and Enhanced AC-3 Descriptors 

These DVB descriptors indicate Dolby® 
Digital or Dolby® Digital Plus audio is present, 
and are discussed in Chapter 13. 

Adaptation Field Data Descriptor 

This DVB descriptor, discussed in Chapter 
13, indicates the type of data field within the 
private data field of the MPEG-2 adaptation 
field. 

Ancillary Data Descriptor 

This DVB descriptor, discussed in Chapter
13, indicates the presence and type of ancillary 
data in MPEG audio elementary streams. 

Announcement Support Descriptor 

This DVB descriptor identifies the type of 
announcement that is supported by the ser- 
vice. It also indicates the announcement trans- 
port method and gives linkage information so 
that the announcement stream can be moni- 
tored. 

Bouquet Name Descriptor 

This DVB descriptor provides the bouquet 
name as variable-length text, such as “Max 
Movie Channels.” 

CA Identifier Descriptor 

This DVB descriptor indicates whether a
bouquet, service, or event is associated with a
conditional access system and, if so, identifies
the conditional access system used.

Cable Delivery System Descriptor 

This DVB descriptor is used to transmit 
the physical parameters of the cable network, 
including frequency, modulation, and symbol 
rate.

Cell Frequency Link Descriptor

This DVB descriptor is used in the NIT to 
describe the terrestrial network. It provides 
links between a cell and the frequencies that 
are used in the cell for the transport stream. 

Cell List Descriptor

This DVB descriptor lists all cells of the
network described by the NIT, together with
their coverage areas.

Component Descriptor 

This DVB descriptor, discussed in Chapter 
13, indicates the type of stream and may be 
used to provide a text description of the 
stream. 

Content Descriptor 

This DVB descriptor is used to identify the
type of content (comedy, talk show, etc.).

Country Availability Descriptor 

This DVB descriptor, discussed in Chapter 
13, identifies countries that are either allowed 
or not allowed to receive the service. The 
descriptor may appear twice for each service, 
once for listing countries allowed to receive 
the service, and a second time for listing coun- 
tries not allowed to receive the service. The lat- 
ter list overrides the former list. 
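
The precedence of the two lists can be sketched as follows. This is only an illustration of the override rule stated above: the function name and country codes are hypothetical, and the handling of a country that appears in neither list is an assumption, not something the text specifies.

```python
def may_receive(country, allowed=None, not_allowed=None):
    """Decide whether a service is intended for a given country code."""
    if not_allowed is not None and country in not_allowed:
        return False  # the "not allowed" list overrides the "allowed" list
    if allowed is not None:
        return country in allowed
    return True  # assumption: with no "allowed" list, nothing is restricted


allowed = {"DEU", "AUT", "CHE"}
not_allowed = {"AUT"}

print(may_receive("DEU", allowed, not_allowed))  # True
print(may_receive("AUT", allowed, not_allowed))  # False
```

Checking the "not allowed" list first is what makes it take precedence when a country appears in both lists.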

Data Broadcast Descriptor 

This DVB descriptor identifies within the 
SI available data broadcast services. 

Data Broadcast ID Descriptor 

This DVB descriptor, discussed in Chapter
13, identifies the data coding system standard.
It is a short form of the Data Broadcast
Descriptor and may be present in the PMT.



DSNG Descriptor 

This DVB descriptor is present only in 
DSNG (Digital Satellite News Gathering) 
transmissions. 

DTS Audio Descriptor 

This DVB descriptor, discussed in Chapter
13, indicates the presence of DTS® audio
elementary streams.

Extended Event Descriptor 

This DVB descriptor provides a text 
description of an event, which may be used in 
addition to the Short Event Descriptor. More 
than one descriptor can be used to convey 
more than 256 bytes of information. 

Extension Descriptor 

This DVB descriptor, discussed in Chapter
13, extends the 8-bit value of the descriptor_tag
field.

Frequency List Descriptor 

This DVB descriptor may be present in the 
NIT. It conveys the additional frequencies 
when content is transmitted on other frequen- 
cies. 

Linkage Descriptor

This DVB descriptor provides a link to 
another service, transport stream, program 
guide, service information, software upgrade, 
etc. 

Local Time Offset Descriptor 

This DVB descriptor may be present in the 
TOT to describe country-specific dynamic 
changes of the local time offset relative to 
UTC. This enables a receiver to automatically 
adjust between summer and winter times. 
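
The automatic switch between winter and summer time can be sketched as below. The sketch assumes the descriptor supplies a current offset, a UTC time of change, and the offset that applies afterwards, which reflects the local time offset descriptor of ETSI EN 300 468 as commonly described. The function name and the dates are hypothetical, and the offsets are passed as already-decoded values rather than parsed from the descriptor.

```python
from datetime import datetime, timedelta


def local_time(utc_now, offset, time_of_change, next_offset):
    """Apply the current offset before the change and the next one after."""
    active = next_offset if utc_now >= time_of_change else offset
    return utc_now + active


# Hypothetical changeover from UTC+1 (winter) to UTC+2 (summer).
change = datetime(2008, 3, 30, 1, 0, 0)
winter, summer = timedelta(hours=1), timedelta(hours=2)

print(local_time(datetime(2008, 3, 30, 0, 59, 59), winter, change, summer))
# 2008-03-30 01:59:59
print(local_time(datetime(2008, 3, 30, 1, 0, 0), winter, change, summer))
# 2008-03-30 03:00:00
```

Because the time of change is broadcast in advance, the receiver can switch at the correct instant without any user intervention.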
Mosaic Descriptor

This DVB descriptor, discussed in Chapter
13, partitions a digital video component into
elementary cells, controls the allocation of
elementary cells to logical cells, and links the
content of the logical cell and the corresponding
information (e.g., bouquet, service, event,
etc.).

Multilingual Bouquet Name Descriptor 

This DVB descriptor provides a bouquet 
name in text form in one or more languages. 

Multilingual Component Descriptor 

This DVB descriptor provides a component
name in text form in one or more languages.
The component is identified by its
component_tag value.

Multilingual Network Name Descriptor 

This DVB descriptor provides a network 
name in text form in one or more languages. 

Multilingual Service Name Descriptor 

This DVB descriptor provides service pro- 
vider names and offered services in text form 
in one or more languages. 

Network Name Descriptor 

This DVB descriptor conveys the network 
name in text form, such as “Munich Cable.” 

NVOD (Near Video On Demand) 
Reference Descriptor 

This DVB descriptor, in conjunction with 
the Time Shifted Service Descriptor and the 
Time Shifted Event Descriptor, provides an effi- 
cient way of describing a number of services 
which carry the same sequence of events, but 
with the start times offset from one another. 



Parental Rating Descriptor 

This DVB descriptor, discussed in Chapter 
13, gives a rating based on age and offers 
extensions to be able to use other rating crite- 
ria. 

Partial Transport Stream Descriptor 

The SIT contains all the information 
needed to control, play, and copy partial trans- 
port streams. This DVB descriptor describes 
this information. 

PDC Descriptor 

This DVB descriptor extends the DVB system
with the functionality of PDC (Program
Delivery Control), defined by ETSI EN 300 231
and ITU-R BT.809, and discussed in Chapter 8.

Private Data Specifier Descriptor 

This DVB descriptor, discussed in Chapter 
13, is used to identify the source of any private 
descriptors or private fields within descriptors. 

Satellite Delivery System Descriptor 

This DVB descriptor is used to transmit 
the physical parameters of the satellite net- 
work, including frequency, orbital position, 
west-east flag, polarization, modulation, and 
symbol rate. 

Scrambling Descriptor 

This DVB descriptor, discussed in Chapter 
13, indicates the selected mode of operation for 
the scrambling system. 

Service Descriptor 

This DVB descriptor provides the name of 
the service and the service provider in text 
form.

Service Availability Descriptor

This DVB descriptor is present in the SDT 
of a terrestrial network. It indicates whether or 
not service is available for the identified cells. 

Service List Descriptor 

This DVB descriptor provides a list of the 
services and service types for each transport 
stream. 

Service Move Descriptor 

This DVB descriptor, discussed in Chapter 
13, provides a way for a receiver to follow a ser- 
vice as it moves from one transport stream to 
another. Some disturbance in the video and 
audio will occur at such a transition. 

Short Event Descriptor 

This DVB descriptor provides the name 
and a short description of an event. 

Short Smoothing Buffer Descriptor 

This MPEG-2 descriptor enables the bit- 
rate of a service to be indicated in the PSI. 

Stream Identifier Descriptor 

This DVB descriptor, discussed in Chapter 
13, enables streams to be associated with a 
description in the EIT, useful when there is 
more than one stream of the same type within 
a service. 

Stuffing Descriptor 

This DVB descriptor is used to stuff tables 
for any reason or to disable descriptors that 
are no longer valid. 

Subtitling Descriptor 

This DVB descriptor, discussed in Chapter 
13, is used to identify ETSI EN 300 743 subtitle 
data. 



Telephone Descriptor 

This DVB descriptor indicates a telephone 
number, which may be used in conjunction 
with a PSTN or cable modem, to support nar- 
rowband interactive channels. 

Teletext Descriptor 

This DVB descriptor, discussed in Chapter 
13, is used to identify elementary streams 
which carry EBU Teletext data. 

Terrestrial Delivery System Descriptor 

This DVB descriptor is used to transmit 
the physical parameters of the terrestrial net- 
work, including center frequency, bandwidth, 
constellation, hierarchy information, code rate, 
guard interval, and transmission mode. 

Time-Shifted Event Descriptor 

This DVB descriptor indicates that an 
event is the time-shifted copy of another event. 

Time-Shifted Service Descriptor 

This DVB descriptor links one service with 
up to 20 other services carrying the same pro- 
gramming, but time-shifted. A typical applica- 
tion is for Near Video On Demand (NVOD) 
services. 

Transport Stream Descriptor 

This DVB descriptor, transmitted only in 
the TSDT, is used to indicate that the MPEG 
transport stream is DVB or DSNG compliant. 

VBI Data Descriptor 

This DVB descriptor, discussed in Chapter
13, defines the VBI service type in the
associated elementary stream.

VBI Teletext Descriptor

The syntax for this descriptor is the same
as for the Teletext Descriptor. The only difference
is that it is not used to associate stream_type =
0x06 with either the VBI or EBU teletext
standard. Decoders use the languages in this
descriptor to select magazines and subtitles.



Data Broadcasting 

The DVB data broadcast standard 
describes the available encapsulation protocols 
used to transport data within a DVB stream. 
Based on MPEG-2 DSM-CC, it specifies DVB 
data piping, DVB data streaming, DVB multi- 
protocol encapsulation, DVB data carousels 
and DVB object carousels. DVB has added 
specific information to get the DSM-CC frame- 
work working in the DVB environment, partic- 
ularly with the DVB SI. 

Five different application areas with differ- 
ent broadcast requirements have been identi- 
fied. For each application area, a profile is 
defined and additional descriptors are used to 
support the application. Additional data broad- 
casting information is available in Chapter 13. 

Application Areas and Profiles 

Data Piping 

This profile supports data broadcast ser- 
vices that use a simple, asynchronous, end-to- 
end delivery of data. Data is carried directly in 
the payloads of MPEG-2 transport stream 
packets. 

Data Streaming 

This profile supports data broadcast services
that use a streaming-oriented, end-to-end
delivery of data in either an asynchronous,
synchronous, or synchronized way. Data is
carried in MPEG-2 PES packets.

Multiprotocol Encapsulation 

This profile supports data broadcast ser- 
vices that use the transmission of datagrams of 
communication protocols, such as IP multicast- 
ing. The transmission of datagrams is done by 
encapsulating the datagrams in MPEG-2 DSM- 
CC sections. 

Data Carousels 

This profile supports data broadcast ser- 
vices that use the periodic transmission of data 
modules. These modules are of known sizes 
and may be updated, added to, or removed 
from the data carousel in time. Data is broad- 
cast using an MPEG-2 DSM-CC Data Carousel. 

Object Carousels 

This profile supports data broadcast ser- 
vices that use the periodic broadcasting of 
DSM-CC User-User (U-U) Objects. Data is 
broadcast using the MPEG-2 DSM-CC Object 
Carousel and DSM-CC Data Carousel. 



Conditional Access 

Conditional Access (CA) is the encryption 
of the content prior to transmission so that 
only authorized users may enjoy it. In order to 
decrypt the protected content, the receiver 
uses a CA module. The CA module enables 
decrypting only those programs that have 
been authorized. 

There are two basic techniques to implement
DVB conditional access: Simulcrypt and
Multicrypt.

Simulcrypt

Simulcrypt relies on the DVB Common 
Scrambling Algorithm (CSA), a tool for the 
secure scrambling and descrambling of trans- 
port streams or program elementary streams. 
The CSA descrambling circuitry is embedded 
inside the video decompression chip rather 
than inside the CA module. The CA module 
does not process the encrypted data directly; it 
simply provides decryption key information to 
the CSA descrambling circuitry so the stream 
can be decrypted. 

Since Simulcrypt-based receivers do not 
need to use the DVB Common Interface, the 
CA module is either embedded inside the 
receiver or a detachable CA module is used 
based on the NRSS-A interface. An embedded 
CA module improves security since access to it 
is more difficult; however, the receiver is not 
easily upgraded or adapted for use with other 
CA systems and therefore may quickly 
become obsolete. 

Using a detachable CA module based on the
EIA-679 NRSS-A interface (ISO 7816 form
factor) allows changing CA systems as easily as
changing a low-cost CA card. CA cards are
supplied by the service providers to each
subscriber.

Simulcrypt also enables the use of more 
than one CA system in a broadcast or receiver. 
Each CA system’s ECMs (Entitlement Control 
Messages) and EMMs (Entitlement Manage- 
ment Messages) are transmitted in the stream. 
Receivers recognize and use the appropriate 
ECM and EMM needed for decrypting. Thus, 
a broadcast containing data for multiple CA 
systems can be viewed on receivers that sup- 
port any of these CA systems. It also enables a 
new CA solution to be deployed while main- 
taining compatibility with a legacy CA system. 
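
The way a Simulcrypt receiver picks out its own messages can be sketched as a simple lookup. This is a conceptual illustration only: the CA system identifiers, the payloads, and the function name are hypothetical, and a real receiver would locate ECM streams through the CA descriptors in the PSI rather than through a prepared list.

```python
def select_ecm(ecms_in_stream, supported_ca_systems):
    """Return (ca_system_id, ecm) for the first CA system we support."""
    for ca_system_id, ecm in ecms_in_stream:
        if ca_system_id in supported_ca_systems:
            return ca_system_id, ecm
    return None  # no supported CA system: the program cannot be decrypted


# One broadcast carrying ECMs for two CA systems (made-up values).
stream = [(0x1111, b"ecm-for-legacy-ca"), (0x2222, b"ecm-for-new-ca")]

print(select_ecm(stream, {0x1111}))  # (4369, b'ecm-for-legacy-ca')
print(select_ecm(stream, {0x2222}))  # (8738, b'ecm-for-new-ca')
print(select_ecm(stream, {0x9999}))  # None
```

Both receivers descramble the same scrambled content; only the messages that deliver the key information differ between CA systems.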

A second generation version of DVB-CSA 
(called DVB-CSA2) is being investigated. 



Multicrypt 

Multicrypt is an open system based on a 
detachable CA module, supplied by the service 
provider to each subscriber. Encrypted 
streams are sent to the CA module. The CA 
module finds and extracts needed data, such as 
ECM and EMM, directly from the streams. 
Decrypted streams are then output to the 
MPEG-2 decoder. 

The CA module is plugged into the 
receiver via the DVB Common Interface. Multi- 
crypt’s advantage is that a receiver can be eas- 
ily configured to receive services from 
different service providers using different and 
incompatible CA systems. As a result, the 
receiver is less likely to become obsolete. 

A receiver can support multiple CA sys- 
tems by using multiple DVB Common Inter- 
faces to support multiple CA modules. 
Encrypted streams are passed sequentially 
through the different CA modules, with each 
of them extracting from the stream its own 
ECMs and EMMs. 

DVB Common Interface 

The DVB Common Interface provides a 
physical separation between receiver and CA 
functions. Based on the EIA-679 NRSS-B inter- 
face (PCMCIA or PC Card form factor), it is 
the key to the Multicrypt system. 

The transport stream interface consists of 
an 8-bit parallel input, 8-bit parallel output, con- 
trol signals, and clocks. The command (host) 
interface consists of an 8-bit bi-directional data 
bus, address, and control signals. 

The interface can also be used to add new 
features to a receiver, such as supporting a 
new audio codec or adding visually impaired 
audio capabilities. 

A second generation version of DVB-CI
(called DVB-CI2) is being investigated.



Application Block Diagrams

Figures 17.2 and 17.3 illustrate a typical
DVB-S set-top box.

Figure 17.2. DVB Receiver Set-Top Box Block Diagram (Multicrypt).

Figure 17.3. DVB Receiver Set-Top Box Block Diagram (Simulcrypt).



ISDB

Digital Television 



The ISDB (Integrated Services Digital
Broadcasting) digital television (DTV)
broadcast standard is used in Japan.

The three other primary DTV standards
are ATSC (Advanced Television Systems
Committee), DVB (Digital Video Broadcasting), and
OpenCable™. The basic audio and video
capabilities are very similar. The major differences
are the RF modulation schemes and the level
of definition for non-audio/video services.
ISDB builds on DVB, adding services
required for Japan. A comparison of the
ISDB standards is shown in Table 18.1.

The ISDB standard is actually a group of 
ARIB standards: 

STD-B10    Service Information For Digital
           Broadcasting System

STD-B16    Standard Digital Receiver Commonly Used For
           Digital Satellite Broadcasting Services Using
           Communication Satellite

STD-B20    ISDB-S: Transmission System For Digital
           Satellite Broadcasting

STD-B21    Receiver For Digital Broadcasting
           (Desirable Specifications)

STD-B23    Application Execution Engine Platform for
           Digital Broadcasting

STD-B24    Data Coding And Transmission Specification
           For Digital Broadcasting

STD-B25    Conditional Access System Specifications for
           Digital Broadcasting

STD-B31    ISDB-T: Transmission System For Digital
           Terrestrial Television Broadcasting

STD-B32    Video Coding, Audio Coding And Multiplexing
           Specifications For Digital Broadcasting

STD-B40    PES Packet Transport Mechanism for
           Ancillary Data

ISDB uses an MPEG-2 transport stream to
convey compressed digital video, compressed
digital audio, and data. Like DVB, this
transport stream is then transmitted via
terrestrial, cable, or satellite networks. Interactive
applications are based on BML (Broadcast
Mark-up Language).



ISDB-S (Satellite) 

Two satellite standards exist, ISDB-S, also 
known as the BS (broadcast satellite) system, 
and DVB-S, also known as the CS (communica- 
tions satellite) system. 

ISDB-S (BS) is also specified by ITU-R
BO.1408. It has a maximum bit-rate of ~52.2
Mbps using TC8PSK modulation and a 34.5
MHz transponder.



CS supports only one transport stream per 
transport channel, and supports up to ~34
Mbps on a 27 MHz channel. The modulation
scheme is QPSK, just like DVB-S. Unlike the 
other variations of ISDB, CS uses MPEG-2 
MP@ML video (480i or 480p) and MPEG-2 BC 
audio. 



ISDB-C (Cable) 

ISDB-C uses 64-QAM modulation, with 
two versions: one that supports only a single 
transport stream per transmission channel and 
one that supports multiple transport streams 
per transmission channel. On a 6 MHz channel,
ISDB-C can transmit up to ~29.16 Mbps.
As the bit-rate on an ISDB-S satellite channel is 
2x that, two cable channels can be used to 
rebroadcast satellite information. 
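
The channel count above follows from a simple division. A minimal Python sketch, using only the approximate bit-rates quoted in this chapter (the function and constant names are illustrative, not taken from any ARIB document):

```python
import math

ISDB_S_MAX_MBPS = 52.2    # approximate ISDB-S (BS) bit-rate quoted in this chapter
ISDB_C_MAX_MBPS = 29.16   # approximate ISDB-C bit-rate on a 6 MHz channel

def cable_channels_needed(satellite_mbps: float, cable_mbps: float) -> int:
    """Return how many cable channels are needed to carry one satellite channel."""
    return math.ceil(satellite_mbps / cable_mbps)

ratio = ISDB_S_MAX_MBPS / ISDB_C_MAX_MBPS
channels = cable_channels_needed(ISDB_S_MAX_MBPS, ISDB_C_MAX_MBPS)
print(f"bit-rate ratio = {ratio:.2f}, cable channels needed = {channels}")
```

The ratio is a little under 2, so a full satellite channel does not fit in one cable channel but fits in two with some capacity to spare.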



Parameter            ISDB-T             ISDB-C             ISDB-S
                     (Terrestrial)      (Cable)            (Satellite)
-------------------------------------------------------------------------
video compression    MPEG-2, MPEG-4.10 (H.264)             (all three)
audio compression    MPEG-2 AAC, MPEG-4 HE-AAC             (all three)
multiplexing         MPEG-2 transport stream               (all three)
modulation           BST-OFDM (note 1)  QAM                PSK
channel bandwidth    6, 7, or 8 MHz     6, 7, or 8 MHz     -

Note:

1. BST-OFDM = Bandwidth Segmented Transmission of OFDM.

Table 18.1. Comparison of ISDB Standards.
ISDB-C also allows passing through
OFDM-based ISDB-T signals since the chan- 
nel bandwidths are the same. 

ISDB-S (BS) signals can also be passed 
through by downconverting the satellite sig- 
nals at the cable head-end, then up-converting 
them at the receiver. This technique is suitable 
only for cable systems that have many (up to 
29) unused channels. 



ISDB-T (Terrestrial) 

ISDB-T, the terrestrial broadcast standard, 
is also specified by ITU-R BT.1306. It has a
maximum bit-rate of ~23.2 Mbps using 5.6
MHz of bandwidth. ISDB-T also supports 
bandwidths of 6, 7, and 8 MHz. 

The bandwidth is divided into 13 OFDM
segments; the segments can be assigned to up
to three segment groups (hierarchical layers)
having different transmission parameters such 
as the carrier modulation scheme, inner-code 
coding rate, and time interleaving length. This 
enables the same program to be broadcast in 
different resolutions, allowing a mobile 
receiver to show a standard-definition picture 
while a stationary receiver shows a high-defini- 
tion picture. 
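
To see how per-layer transmission parameters translate into bit-rate, the following Python sketch estimates the payload rate of a group of segments. The numeric defaults (96 data carriers per segment, a 252 microsecond useful symbol, a 1/32 guard interval, and Reed-Solomon (204,188) outer coding) are assumptions taken from commonly published ISDB-T Mode 1 parameters, not from this chapter, and should be checked against ARIB STD-B31. The two-layer split at the end is purely hypothetical.

```python
def isdbt_segment_bitrate(bits_per_carrier, inner_code_rate,
                          data_carriers=96, useful_symbol_s=252e-6,
                          guard_ratio=1 / 32, rs_rate=188 / 204):
    """Estimate the payload bit-rate (bits/s) of one OFDM segment.

    All default values are assumed Mode 1 parameters (see lead-in).
    """
    symbol_s = useful_symbol_s * (1 + guard_ratio)   # symbol plus guard interval
    payload_bits = data_carriers * bits_per_carrier * inner_code_rate * rs_rate
    return payload_bits / symbol_s

# All 13 segments in one layer: 64-QAM (6 bits/carrier), rate-7/8 inner code.
full = 13 * isdbt_segment_bitrate(6, 7 / 8)

# Hypothetical two-layer split: 1 robust segment for mobile reception
# (QPSK, rate 2/3) and 12 segments for fixed reception (64-QAM, rate 3/4).
mobile = 1 * isdbt_segment_bitrate(2, 2 / 3)
fixed = 12 * isdbt_segment_bitrate(6, 3 / 4)

print(f"single layer : {full / 1e6:.2f} Mbps")
print(f"mobile layer : {mobile / 1e6:.3f} Mbps")
print(f"fixed layer  : {fixed / 1e6:.2f} Mbps")
```

Under these assumptions the single-layer figure lands close to the ~23.2 Mbps maximum quoted above, and the hierarchical split shows the trade-off: the robust layer carries far fewer bits per segment than the high-rate layer.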



Video Capability 

There are several standardized resolu- 
tions, indicated in Table 18.2. 

Primary video compression is based on 
MPEG-2 MP@ML or MP@HL. However, there 
are some minor constraints on some of the 
MPEG-2 parameters, as discussed within the 
MPEG-2 chapter. 



MPEG-4.2 Simple Profile or Core Profile
video is also supported, using resolutions of
176 x 144 (64 or 384 kbps) or 352 x 288 (128,
384, or 2000 kbps).

MPEG-4.10 (H.264) Baseline Profile or
Main Profile video is also supported, using resolutions
of 176 x 144 (64 kbps) or 352 x 288
(192, 384, 768, 2000, or 4000 kbps).



Audio Capability 

Primary audio compression is imple- 
mented using MPEG-2 AAC-LC with up to 5.1 
channels. ISDB also supports MPEG-4 HE- 
AAC audio. 



Still Picture Capability 

Still pictures are supported using JPEG 
(ISO/IEC 10918-1), PNG (Portable Network 
Graphics), MNG (Multiple-image Network 
Graphics), MPEG-2 I-frame, MPEG-4.2 I-VOP, 
and MPEG-4.10 (H.264) I-picture formats. 



Graphics Capability 

Graphics commands include Domain, Tex- 
ture (fill, vertical hatch, horizontal hatch, cross 
hatch), Set Color (foreground, background), 
Select Color, Blink, Set Pattern, Point, Line 
(solid, dotted, broken, dotted-broken), Arc 
(outlined, filled), Rectangle (outlined, filled),
and Polygon (outlined, filled).

Figure 18.1 illustrates the 5-plane video/ 
graphics architecture used by ISDB. 
Active Resolution
(Y) 


SDTV or HDTV 


Frame Rate 

(p = progressive, i = interlaced) 


MPEG-2


MPEG-4.2 


MPEG-4.10 

(H.264) 


23.976p 

24p 


29.97i 

30i 


29.97p 

30p 


59.94p 

60p 


176 x 120 


SDTV


X 




X 


X 


X 






176 x 144 


X 




X 


X 


X 


X 


X 


352 x 240 


X 




X 


X 


X 






352 x 288 


X 




X 


X 


X 


X 


X 


352 x 480 




X 






X 






480 x 480 




X 






X 






544 x 480 




X 






X 






720 x 480 




X 






X 






1280 x 720 


HDTV 


X 




X 


X 


X 






1440 x 1080 


X 


X 


X 




X 






1920 x 1080 


X 


X 


X 




X 







Table 18.2. Common Active Resolutions for ISDB Digital Television. 






Figure 18.1. ISDB 5-Plane Video/Graphics Standard. 
System Information (SI)

ARIB STD-B10 specifies the Service Information
(SI) data which forms a part of ISDB
bitstreams. SI is a small collection of hierarchi- 
cally associated tables (see Table 18.3) 
designed to extend the MPEG-2 PSI tables. It 
provides information on what is available on 
other transport streams and even other net- 
works. The method of information presenta- 
tion to the user is not specified, allowing 
receiver manufacturers to choose appropriate 
presentation methods. 

Tables 

Application Information Table (AIT)

The AIT transmits dynamic control information
concerning ARIB-J applications, along with the
information needed to execute them.

Bouquet Association Table (BAT) 

The BAT provides information regarding 
bouquets (groups of services that may traverse 
the network boundary). Along with the name 
of the bouquet, it provides a list of services for 
each bouquet. 

Broadcaster Information Table (BIT) 

The BIT is used to convey broadcaster
information for the network.

Common Data Table (CDT) 

The CDT transmits data which is required 
for all receivers and is to be stored in non-vola- 
tile memory. 

Discontinuity Information Table (DIT)

The DIT is present at transition points
where the SI information is discontinuous. The
use of this table is restricted to partial transport
streams; it is not used in broadcasts.

Download Table (DLT) 

The DLT is used to transmit software for 
downloading. 

Download Control Table (DCT) 

The DCT is used to transmit information
that indicates how to process the DLT.

Event Information Table (EIT)

There are up to 128 EITs, EIT-0 through 
EIT-127, each of which describes the events or 
TV programs associated with each channel. 
Each EIT is valid for 3 hours. Since there are 
up to 128 EITs, up to 16 days of programming 
may be advertised in advance. The first four 
EITs are required (the first 24 are recom- 
mended) to be present. 
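
The 16-day figure is simply the number of tables multiplied by the period each one covers. A small Python sketch of that arithmetic follows; the helper that maps a time offset to a table index assumes, as an illustration only, that consecutive EITs cover consecutive 3-hour periods starting with EIT-0.

```python
EIT_COUNT = 128       # EIT-0 through EIT-127
HOURS_PER_EIT = 3     # each EIT is valid for 3 hours

def guide_span_days(eit_count: int = EIT_COUNT,
                    hours_per_eit: int = HOURS_PER_EIT) -> float:
    """Total programming span, in days, that the EITs can advertise."""
    return eit_count * hours_per_eit / 24

def eit_index_for(hours_ahead: float) -> int:
    """Index k of the EIT-k covering an event `hours_ahead` hours after the
    start of the period covered by EIT-0 (assumed consecutive periods)."""
    index = int(hours_ahead // HOURS_PER_EIT)
    if not 0 <= index < EIT_COUNT:
        raise ValueError("event lies outside the advertised window")
    return index

print(guide_span_days())      # 128 * 3 / 24 days
print(eit_index_for(7.5))     # an event 7.5 hours ahead
```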

Information provided by the EIT includes
start time, duration, title, pointer to descriptive 
text for the event, advisory data, caption ser- 
vice data, audio service descriptor, etc. 

Event Relation Table (ERT) 

The ERT indicates relationships between 
programs or events and their attributes. 

Index Transmission Table (ITT) 

The ITT is used to convey program index 
information with a program. 

Linked Description Table (LDT)

The LDT is used to link various descrip- 
tions from other tables. 

Local Event Information Table (LIT) 

The LIT conveys information in a program 
that relates to a local event, such as time, 
name, and explanation of the local event. 
Network Board Information Table (NBIT)

The NBIT transmits board information on 
the network, e.g., a guide. 

Network Information Table (NIT)

The NIT provides information about the 
physical network, including any grouping of 
transport streams and the relevant tuning 
information. It can be used during receiver set- 
up and the relevant tuning information stored 
in non-volatile memory. The NIT can also be 
used to signal changes of tuning information. 

Partial Content Announcement Table 
(PCAT) 

The PCAT conveys partial content 
announcement for data broadcasting. 

Running Status Table (RST) 

The RST updates the running status of one 
or more events. These are sent out only once, 
at the time of an event status change, unlike 
other tables which are usually transmitted 
repeatedly. 

Selection Information Table (SIT)

The SIT describes services and events carried
by a partial transport stream. The use of
this table is restricted to partial transport
streams; it is not used in broadcasts.

Service Description Table (SDT) 

The SDT conveys information related to 
the channel, such as channel name and broad- 
casting company name. 

Software Download Trigger Table (SDTT) 

The SDTT conveys notification informa- 
tion, such as download service ID, schedule 
information, and receiver types for revision. 



Stuffing Table (ST) 

The ST is used to replace or invalidate sub- 
tables or complete SI tables. 

Time and Date Table (TDT) 

The TDT contains the actual UTC time,
coded as Modified Julian Date (MJD). Receivers
can use it to maintain the correct local
time.

Time Offset Table (TOT) 

The TOT is the same as the TDT, except it 
includes local time offset information. 
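
As an illustration of MJD-coded time, the Python sketch below decodes a 5-byte time field and applies a local offset. The field layout (a 16-bit MJD followed by hours, minutes, and seconds as six BCD digits) is an assumption based on the layout used by DVB service information, on which ISDB builds; the exact ARIB field definition should be taken from STD-B10. Function names are illustrative.

```python
from datetime import datetime, timedelta

MJD_EPOCH = datetime(1858, 11, 17)   # day 0 of the Modified Julian Date scale

def bcd_to_int(byte: int) -> int:
    """Convert one packed-BCD byte (two decimal digits) to an integer."""
    return (byte >> 4) * 10 + (byte & 0x0F)

def decode_mjd_bcd_time(field: bytes) -> datetime:
    """Decode an assumed 40-bit time field: 16-bit MJD + hhmmss in BCD."""
    if len(field) != 5:
        raise ValueError("time field must be exactly 5 bytes")
    mjd = (field[0] << 8) | field[1]
    hours, minutes, seconds = (bcd_to_int(b) for b in field[2:5])
    return MJD_EPOCH + timedelta(days=mjd, hours=hours,
                                 minutes=minutes, seconds=seconds)

def apply_local_offset(t: datetime, offset_minutes: int) -> datetime:
    """Apply a local time offset such as the one a TOT could carry."""
    return t + timedelta(minutes=offset_minutes)

decoded = decode_mjd_bcd_time(bytes.fromhex("C079124500"))
print(decoded)                              # date and time carried in the field
print(apply_local_offset(decoded, 9 * 60))  # same instant with a +9 hour offset
```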

Descriptors 

Much like MPEG-2, ISDB uses descriptors 
to add new functionality. In addition to various 
MPEG-2 descriptors, one or more of these 
ISDB-specific descriptors may be included 
within the PMT or one or more SI tables (see 
Table 18.3) to extend data within the tables. A 
descriptor not recognized by a decoder must 
be ignored by that decoder. This enables new 
descriptors to be implemented without affect- 
ing receivers that cannot recognize and pro- 
cess the descriptors. 
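
The "ignore what you do not recognize" rule works because every descriptor starts with a one-byte tag followed by a one-byte length, as in MPEG-2, so a decoder can step over a descriptor without understanding its payload. A minimal Python sketch of such a descriptor loop follows; the three tag values are the ones listed in Table 18.3, while the function name and the sample bytes are made up for illustration.

```python
# Tag values as listed in Table 18.3 (binary tags converted to hex).
KNOWN_DESCRIPTORS = {
    0x48: "service",
    0x4D: "short event",
    0xC4: "audio component",
}

def walk_descriptors(loop: bytes):
    """Return (name, payload) for each recognized descriptor in a descriptor
    loop, silently skipping descriptors whose tag is not recognized."""
    found = []
    pos = 0
    while pos < len(loop):
        if pos + 2 > len(loop):
            raise ValueError("truncated descriptor header")
        tag, length = loop[pos], loop[pos + 1]
        payload = loop[pos + 2:pos + 2 + length]
        if len(payload) < length:
            raise ValueError("truncated descriptor payload")
        pos += 2 + length               # the length field lets us skip anything
        name = KNOWN_DESCRIPTORS.get(tag)
        if name is not None:
            found.append((name, payload))
    return found

# Made-up loop: a service descriptor, an unknown tag 0xEE, an audio component.
sample = bytes([0x48, 2, 0x01, 0x00,
                0xEE, 3, 1, 2, 3,
                0xC4, 1, 0x0F])
print(walk_descriptors(sample))
```

The unknown 0xEE descriptor is stepped over using its length byte, which is what allows new descriptors to be introduced without disturbing older receivers.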

Audio Component Descriptor 

This ARIB descriptor indicates the parame- 
ters of an audio elementary stream. 

AVC Timing and HRD Descriptor 

This ARIB descriptor, also discussed in 
Chapter 13, describes the video stream time 
information and the reference decoder infor- 
mation for H.264 (MPEG-4.10). 

AVC Video Descriptor 

This ARIB descriptor, also discussed in 
Chapter 13, describes basic coding parameters 
of the H.264 (MPEG-4.10) video stream. 
Table   PID       Table ID      Repetition rate
PMT     per PAT   0x02          100 ms
NIT     0x0010    0x40, 0x41    10 sec
BAT     0x0011    0x4A          10 sec
SDT     0x0011    0x42, 0x46    2-10 sec
EIT     0x0012    0x4E-0x6F     2-10 sec
TOT     0x0014    0x73          30 sec
BIT     0x0024    0xC4          20 sec
NBIT    0x0025    0xC5, 0xC6    20 sec
LDT     0x0025    0xC7          20 sec

Descriptor                                Tag        PMT  NIT  BAT  SDT  EIT  TOT  BIT  NBIT LDT
audio component                           1100 0100  -    -    -    -    X    -    -    -    -
AVC timing and HRD                        0010 1010  X    -    -    -    -    -    -    -    -
AVC video                                 0010 1000  X    -    -    -    -    -    -    -    -
basic local event                         1101 0000  -    -    -    -    -    -    -    -    -
board information                         1101 1011  -    -    -    -    -    -    -    X    -
bouquet name                              0100 0111  -    -    X    X    -    -    -    -    -
broadcaster name                          1101 1000  -    -    -    -    -    -    X    -    -
CA contract information                   1100 1011  -    -    -    -    -    -    -    -    -
CA EMM TS                                 1100 1010  -    -    -    -    -    -    -    -    -
CA identifier                             0101 0011  -    -    X    X    X    -    -    -    -
CA service                                1100 1100  -    -    -    -    -    -    -    -    -
cable TS division system                  1111 1001  -    -    -    -    -    -    -    -    -
cable distribution system                 0100 0100  -    -    -    -    -    -    -    -    -
carousel compatible composite descriptor  1111 0111  X    -    -    -    X    -    -    -    -
component                                 0101 0000  X    -    -    -    X    -    -    -    -
component group                           1101 1001  -    -    -    -    X    -    -    -    -
conditional playback                      1111 1000  X    -    -    -    -    -    -    -    -
connected transmission                    1101 1101  -    -    -    -    -    -    -    -    -

Note : 

1. PMT: MPEG-2 Program Map Table. 



Table 18.3a. List of ISDB SI Tables, Descriptors, and Descriptor Locations. 
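
Because several SI tables share a PID (BAT and SDT both use 0x0011, NBIT and LDT both use 0x0025), a receiver needs both the PID and the table ID to tell them apart. The Python sketch below encodes the PID and table ID values from the header of Table 18.3 and looks a section up by that pair. The data structure and function name are illustrative; the PMT is left out because its PID is signalled by the PAT rather than fixed.

```python
# (PID, inclusive table ID ranges) per SI table, from the header of Table 18.3.
SI_TABLES = {
    "NIT":  (0x0010, [(0x40, 0x41)]),
    "BAT":  (0x0011, [(0x4A, 0x4A)]),
    "SDT":  (0x0011, [(0x42, 0x42), (0x46, 0x46)]),
    "EIT":  (0x0012, [(0x4E, 0x6F)]),
    "TOT":  (0x0014, [(0x73, 0x73)]),
    "BIT":  (0x0024, [(0xC4, 0xC4)]),
    "NBIT": (0x0025, [(0xC5, 0xC6)]),
    "LDT":  (0x0025, [(0xC7, 0xC7)]),
}

def identify_si_table(pid: int, table_id: int):
    """Return the SI table name for a (PID, table ID) pair, or None."""
    for name, (table_pid, id_ranges) in SI_TABLES.items():
        if pid != table_pid:
            continue
        if any(low <= table_id <= high for low, high in id_ranges):
            return name
    return None

print(identify_si_table(0x0011, 0x4A))   # BAT and SDT share this PID
print(identify_si_table(0x0011, 0x42))
print(identify_si_table(0x0012, 0x50))
```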
Descriptor                                Tag        PMT  NIT  BAT  SDT  EIT  TOT  BIT  NBIT LDT
content                                   0101 0100  -    -    -    -    X    -    -    -    -
content availability                      1101 1110  X    -    X    X    -    -    -    -    -
country availability                      0100 1001  X    -    X    X    -    -    -    -    -
data component                            1111 1101  X    -    -    -    -    -    -    -    -
data content                              1100 0111  -    -    -    -    X    -    -    -    -
digital copy control                      1100 0001  X    -    -    X    X    -    -    -    -
download content                          1100 1001  -    -    -    -    -    -    -    -    -
emergency information                     1111 1100  X    X    -    -    X    -    -    -    -
event group                               1101 0110  -    -    -    -    X    -    -    -    -
extended broadcaster                      1100 1110  -    -    -    -    -    -    X    -    -
extended event                            0100 1110  -    -    -    -    X    -    -    -    X
hierarchical transmission                 1100 0000  X    -    -    -    -    -    -    -    -
hyperlink                                 1100 0101  -    -    -    -    X    -    -    -    -
LDT linkage                               1101 1100  -    -    -    -    X    -    -    -    -
linkage                                   0100 1010  X    X    X    X    X    -    -    -    -
local time offset                         0101 1000  -    -    -    -    -    X    -    -    -
logo transmission                         1100 1111  -    -    -    X    -    -    -    -    -
mosaic                                    0101 0001  X    -    -    X    -    -    -    -    -
network identification                    1100 0010  -    -    -    -    -    -    -    -    -
network name                              0100 0000  -    X    -    -    -    -    -    -    -
node relation                             1101 0010  -    -    -    -    -    -    -    -    -
NVOD reference                            0100 1011  -    -    -    X    -    -    -    -    -
parental rating                           0101 0101  X    -    -    -    X    -    -    -    -
partial reception                         1111 1011  -    X    -    -    -    -    -    -    -

Note: 

1. PMT: MPEG-2 Program Map Table. 



Table 18.3b. List of ISDB SI Tables, Descriptors, and Descriptor Locations. 
Descriptor                                Tag        PMT  NIT  BAT  SDT  EIT  TOT  BIT  NBIT LDT
partial transport stream                  0110 0011  -    -    -    -    -    -    -    -    -
partial transport stream time             1100 0011  -    -    -    -    -    -    -    -    -
reference                                 1101 0001  -    -    -    -    -    -    -    -    -
satellite delivery system                 0100 0011  -    X    -    -    -    -    -    -    -
series                                    1101 0101  -    -    -    -    X    -    -    -    -
service                                   0100 1000  -    -    -    X    -    -    -    -    -
service list                              0100 0001  -    X    X    -    -    -    X    -    -
short event                               0100 1101  -    -    -    -    X    -    -    -    X
short node information                    1101 0011  -    -    -    -    -    -    -    -    -
SI parameter                              1101 0111  -    -    -    -    -    -    X    -    -
SI Prime_TS                               1101 1010  -    -    -    -    -    -    X    -    -
STC reference                             1101 0100  -    -    -    -    -    -    -    -    -
stream identifier                         0101 0010  X    -    -    -    -    -    -    -    -
stuffing                                  0100 0010  -    X    X    X    X    -    -    X    X
system management                         1111 1110  X    X    -    -    -    -    -    -    -
target region                             1100 0110  X    -    -    -    -    -    -    -    -
terrestrial delivery system               1111 1010  -    X    -    -    -    -    -    -    -
time-shifted event                        0100 1111  -    -    -    -    X    -    -    -    -
time-shifted service                      0100 1100  -    -    -    X    -    -    -    -    -
TS information                            1100 1101  -    -    -    -    -    -    -    -    -
video decode control                      1100 1000  X    -    -    -    -    -    -    -    -

Note: 

1. PMT: MPEG-2 Program Map Table. 



Table 18.3c. List of ISDB SI Tables, Descriptors, and Descriptor Locations. 
Basic Local Event Descriptor

This ARIB descriptor indicates the local 
event identifier information. 

Board Information Descriptor 

This ARIB descriptor indicates the title and 
content of the board information in text format. 

Bouquet Name Descriptor 

This ARIB descriptor provides the bouquet 
name as variable-length text, such as “Max 
Movie Channels.” DVB also uses this descrip- 
tor. 

Broadcaster Name Descriptor 

This ARIB descriptor indicates the name of
the broadcaster.

CA Contract Information Descriptor

This ARIB descriptor describes the condi- 
tional access service type for the scheduled 
program. 

CA EMM TS Descriptor 

This ARIB descriptor indicates the special 
trap-on when the EMM transmission is made 
by the special trap-on method. 

CA Identifier Descriptor 

This ARIB descriptor indicates whether a 
bouquet, service, or event is associated with a 
conditional access system and if so, identifies 
the conditional access used. DVB also uses 
this descriptor. 

CA Service Descriptor 

This ARIB descriptor conveys the broad- 
cast service provider servicing the automatic 
message indication. 



Carousel Compatible Composite 
Descriptor 

This ARIB descriptor, also discussed in 
Chapter 13, uses descriptors defined in the 
data carousel transmission specification (ARIB 
STD-B24 Part 3) as sub-descriptors, and 
describes accumulation control by applying 
the functions of the sub-descriptors. 

Component Descriptor 

This ARIB descriptor, also discussed in 
Chapter 13, indicates the type of stream and 
may be used to provide a text description of 
the stream. DVB also uses this descriptor. 

Component Group Descriptor 

This ARIB descriptor defines and identifies 
component grouping in an event. 

Conditional Playback Descriptor 

This ARIB descriptor, also discussed in 
Chapter 13, conveys the description of condi- 
tional playback and the PID that transmits the 
ECM and EMM. 

Connected Transmission Descriptor 

This ARIB descriptor indicates the physi- 
cal condition when connected to a transmis- 
sion in the terrestrial audio transmission path. 

Content Descriptor 

This ARIB descriptor is used to identify the
type of content (comedy, talk show, etc.). DVB
also uses this descriptor.

Content Availability Descriptor 

This ARIB descriptor, also discussed in
Chapter 13, describes information to control
the recording and output of content by receivers.
The encryption_mode flag indicates
whether or not to encrypt the digital video outputs.
It is used in combination with the Digital
Copy Control Descriptor.

Country Availability Descriptor 

This ARIB descriptor, also discussed in 
Chapter 13, identifies countries that are either 
allowed or not allowed to receive the service. 
The descriptor may appear twice for each ser- 
vice, once for listing countries allowed to 
receive the service, and a second time for list- 
ing countries not allowed to receive the ser- 
vice. The latter list overrides the former list. 
DVB also uses this descriptor. 

Data Component Descriptor 

This ARIB descriptor, also discussed in 
Chapter 13, identifies data components. 

Data Content Descriptor 

This ARIB descriptor describes the 
detailed information relating to individual con- 
tents of a data broadcasting event. 

Digital Copy Control Descriptor 

This ARIB descriptor, also discussed in 
Chapter 13, signals copy generation informa- 
tion, including copy-free, copy-one-generation, 
and copy-never. 

For content which is either copy-restricted
by digital_recording_control_data in the Digital
Copy Control Descriptor, or copy-protected by
encryption_mode in the Content Availability
Descriptor, receivers are prohibited from transferring
the content to any output that potentially
allows redistribution of it over the
Internet.

Download Content Descriptor 

This ARIB descriptor conveys download 
attribute information such as size, type, and 
ID. 



Emergency Information Descriptor 

This ARIB descriptor, also discussed in 
Chapter 13, is used to broadcast an emergency 
message. 

Event Group Descriptor 

This ARIB descriptor, when there is a rela- 
tionship between multiple events, indicates 
that these events are in a group. 

Extended Broadcaster Descriptor 

This ARIB descriptor specifies the 
extended broadcaster identification informa- 
tion and defines the relationships with other 
extended broadcasters and broadcasters of 
other networks. 

Extended Event Descriptor 

This ARIB descriptor provides a text 
description of an event, which may be used in 
addition to the Short Event Descriptor. More 
than one descriptor can be used to convey 
more than 256 bytes of information. DVB also 
uses this descriptor. 

Hierarchical Transmission Descriptor 

This ARIB descriptor, also discussed in 
Chapter 13, indicates the relationship between 
hierarchical streams. 

Hyperlink Descriptor 

This ARIB descriptor describes the link- 
age to other events, event contents, and infor- 
mation events. 

LDT Linkage Descriptor

This ARIB descriptor describes the link- 
age of the information collected in the LDT. 
Linkage Descriptor

This ARIB descriptor, also discussed in 
Chapter 13, provides a link to another service, 
transport stream, program guide, service 
information, software upgrade, etc. DVB also 
uses this descriptor. 

Local Time Offset Descriptor 

This ARIB descriptor may be present in 
the TOT to describe country-specific dynamic 
changes of the local time offset relative to 
UTC. This enables a receiver to adjust auto- 
matically between summer and winter times. 
DVB also uses this descriptor. 

Logo Transmission Descriptor 

This ARIB descriptor describes service 
logo information, such as pointing to PNG logo 
data transmitted by ARIB STD-B21, logo iden- 
tifier, logo version, and the 8-unit code alpha- 
numeric character string for a simple logo. 

Mosaic Descriptor 

This ARIB descriptor, also discussed in
Chapter 13, partitions a digital video component
into elementary cells, controls the allocation of
elementary cells to logical cells, and links the
content of each logical cell to the corresponding
information (e.g., bouquet, service, event,
etc.). DVB also uses this descriptor.

Network Identification Descriptor 

This ARIB descriptor identifies the net- 
work. 

Network Name Descriptor 

This ARIB descriptor conveys the network
name in text form, such as “Tokyo Cable.”
DVB also uses this descriptor.



Node Relation Descriptor 

This ARIB descriptor describes the rela- 
tionship between two nodes. 

NVOD (Near Video On Demand) 
Reference Descriptor 

This ARIB descriptor, in conjunction with 
the Time-Shifted Service Descriptor and the 
Time-Shifted Event Descriptor, provides an effi- 
cient way of describing a number of services 
which carry the same sequence of events, but 
with the start times offset from one another. 
DVB also uses this descriptor. 

Parental Rating Descriptor 

This ARIB descriptor, also discussed in 
Chapter 13, gives a rating based on age and 
offers extensions to be able to use other rating 
criteria. DVB also uses this descriptor. 

Partial Reception Descriptor 

This ARIB descriptor indicates the
service_id transmitted by the partial reception
hierarchy of the terrestrial transmission path.

Partial Transport Stream Descriptor 

The SIT contains all the information 
needed to control, play, and copy partial trans- 
port streams. This ARIB descriptor describes 
this information. DVB also uses this descrip- 
tor. 

Partial Transport Stream Time Descriptor 

This ARIB descriptor describes partial 
transport stream time information. 

Reference Descriptor 

This ARIB descriptor indicates the node 
reference from programs and local events. 
Satellite Delivery System Descriptor

This ARIB descriptor conveys the physical 
parameters of the satellite network, including 
frequency, orbital position, west-east flag, 
polarization, modulation, and symbol rate. 
DVB also uses this descriptor. 

Series Descriptor 

This ARIB descriptor identifies a series 
event. 

Service Descriptor 

This ARIB descriptor provides the name of
the service and the service provider in text
form. DVB also uses this descriptor, although
the service_type_id values differ between
ARIB and DVB.

Service List Descriptor 

This ARIB descriptor provides a list of the 
services and service types for each transport 
stream. DVB also uses this descriptor. 

Short Event Descriptor 

This ARIB descriptor provides the name 
and a short description of an event. DVB also 
uses this descriptor. 

Short Node Information Descriptor

This ARIB descriptor indicates the node 
name and simple explanation. 

SI Parameter Descriptor 

This ARIB descriptor indicates the SI 
parameter. 

SI Prime_TS Descriptor 

This ARIB descriptor indicates the identi- 
fier information of the SI prime TS and its 
transmission parameter. 



STC Reference Descriptor 

This ARIB descriptor indicates the rela- 
tionship between the identification time of the 
local event and the STC. 

Stream Identifier Descriptor 

This ARIB descriptor, also discussed in 
Chapter 13, enables streams to be associated 
with a description in the EIT, useful when 
there is more than one stream of the same type 
within a service. DVB also uses this descriptor. 

Stuffing Descriptor 

This ARIB descriptor is used to stuff tables 
for any reason or to disable descriptors that 
are no longer valid. DVB also uses this descrip- 
tor. 

System Management Descriptor 

This ARIB descriptor, also discussed in 
Chapter 13, identifies the broadcasting and 
non-broadcasting formats used. 

Target Region Descriptor 

This ARIB descriptor, also discussed in 
Chapter 13, describes the target region of an 
event or a part of the stream comprising an 
event. 

Terrestrial Delivery System Descriptor 

This ARIB descriptor is used to transmit 
the physical parameters of the terrestrial net- 
work, including center frequency, bandwidth, 
constellation, hierarchy information, code rate, 
guard interval, and transmission mode. 

Time-Shifted Event Descriptor 

This ARIB descriptor indicates that an 
event is the time-shifted copy of another event. 
DVB also uses this descriptor. 
Time-Shifted Service Descriptor

This ARIB descriptor links one service 
with up to 20 other services carrying the same 
programming, but time-shifted. A typical appli- 
cation is for Near Video On Demand (NVOD) 
services. DVB also uses this descriptor. 

TS Information Descriptor

This ARIB descriptor specifies the remote 
control key identifier assigned to the applica- 
ble transport stream and indicates the relation- 
ship between the service identifier and the 
transmission layer during hierarchical trans- 
mission. 

Video Decode Control Descriptor 

This ARIB descriptor, also discussed in
Chapter 13, is used to control the decoding of
MPEG-based still pictures transmitted at low
transmission speed, and to achieve smooth
decoding at video splice points when the video
coding method changes.



Captioning 

Japanese captioning data (ARIB STD-B24
Part 3) can be present in video PES, audio
PES, or independent PES (preferred). Captioning
not related to video content is called
“superimpose” (ARIB STD-B5).

Both the horizontal and vertical writing 
formats may be used. Supported character sets 
include Mosaic, Chinese, Kanji, Hiragana, 
Katakana, Symbol, and alpha-numeric. 
Attributes include reverse polarity, flash, 
underline, hem, shade, bold, italic, and bold- 
italic. Bitmap graphics are also supported. 

Display control includes display timing, 
erase timing, cut, dissolve, wipe, slide, and roll. 
It also supports flexible viewing, recording, 
and playback options. 



Data Broadcasting 

The ARIB data broadcast standard
describes the available encapsulation protocols
used to transport data within an ARIB stream.
Based on MPEG-2 DSM-CC, it also supports
an XML-based multimedia coding scheme.

Five different data broadcast specifications 
have been identified. Most of the specifications 
have additional descriptors used to support the 
specification. 

Data Carousel Transmission 

This specification transmits general syn- 
chronous and asynchronous data, allowing a 
receiver to obtain data during its transmission 
period. Used for download and multimedia ser- 
vices. 

Data Piping 

If required, this specification may be used 
to deliver data to a receiver. Data is carried 
directly in the payloads of MPEG-2 transport 
stream packets. 
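
Since piped data sits directly in transport stream packet payloads, a receiver only needs to parse the 4-byte MPEG-2 packet header (and skip any adaptation field) to reach it. The Python sketch below shows that step. The header layout is the standard MPEG-2 transport packet format; the PID 0x0100 used in the example is hypothetical, since the PID actually carrying piped data is signalled by the broadcaster.

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47

def extract_piped_payload(packet: bytes, wanted_pid: int):
    """Return the payload bytes of one transport packet if it belongs to
    `wanted_pid` and carries a payload; otherwise return None."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a valid 188-byte transport stream packet")
    pid = ((packet[1] & 0x1F) << 8) | packet[2]          # 13-bit PID
    if pid != wanted_pid:
        return None
    adaptation_field_control = (packet[3] >> 4) & 0x03
    if adaptation_field_control == 0b01:                 # payload only
        start = 4
    elif adaptation_field_control == 0b11:               # adaptation field + payload
        start = 5 + packet[4]                            # skip length byte and field
    else:                                                # no payload present
        return None
    return packet[start:]

# Hypothetical packet on PID 0x0100 with a payload-only header.
header = bytes([SYNC_BYTE, 0x01, 0x00, 0x10])
example_packet = header + bytes(range(184))
payload = extract_piped_payload(example_packet, wanted_pid=0x0100)
print(len(payload))
```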

Event Message Transmission 

This specification is used for synchronous 
and asynchronous message notification (either 
immediately or at a specified time) to an appli- 
cation in the receiver. Used for multimedia ser- 
vices. 

Independent PES Transmission 

This specification supports data broadcast 
services that use a streaming-oriented delivery 
of data in either an asynchronous or synchro- 
nous way. Data is carried in MPEG-2 PES pack- 
ets. Also used for subtitles and superimposed 
characters. 
Interaction Channel Protocols

This specification provides the transmis- 
sion protocols used over public networks 
including PSTN, ISDN, and mobile networks 
for bi-directional interactive services. 



Application Block Diagrams 

Figure 18.2 illustrates a typical ISDB-S set- 
top box. 

References 

1. STD-B10, Service Information for Digital
Broadcasting System, version 3.8.

2. STD-B16, Standard Digital Receiver Com- 
monly Used for Digital Satellite Broadcast- 
ing Services Using Communication 
Satellite. 



3. STD-B20, ISDB-S: Transmission System for 
Digital Satellite Broadcasting. 

4. STD-B21, Receiver for Digital Broadcasting
(Desirable Specifications), version 4.2.

5. STD-B23, Application Execution Engine 
Platform for Digital Broadcasting. 

6. STD-B24, Data Coding and Transmission 
Specification for Digital Broadcasting, ver- 
sion 4.0. 

7. STD-B25, Conditional Access System Speci- 
fications for Digital Broadcasting. 

8. STD-B31, ISDB-T: Transmission System for 
Digital Terrestrial Television Broadcasting, 
version 1.5. 

9. STD-B32, Video Coding, Audio Coding and 
Multiplexing Specifications for Digital 
Broadcasting, version 1.5. 

10. STD-B40, PES Packet Transport Mecha- 
nism for Ancillary Data. 





Figure 18.2. ISDB Receiver Set-Top Box Block Diagram. 



Chapter 19 



IPTV 



With the increased use of digital video and 
high-speed broadband networks, transferring 
real-time audio and video over a broadband 
network has become popular. The technology 
is known by several names, including IPTV 
(Internet Protocol TV), streaming video, video 
over IP, and IP video. 

Rather than downloading and storing large 
audio and video files, then playing them back, 
data is sent across the network in streams. 
Streaming breaks the audio and video data into 
small packets suitable for transmission. The 
real-time audio and video data flows from a 
video server or real-time video encoder, 
through a network, and is decoded and played 
by the receiver (or “client”) in real time. Thus, 
the user can start viewing a video without wait- 
ing until the end of the download process. 

Telcos are adopting IPTV over DSL and 
FTTH as a way of offering video services to 
compete with cable and satellite TV. They are 
now able to offer VoIP (Voice over IP), video- 
on-demand (VOD), gaming, music, interactive 
television, and local, national, and premium 
television programming. 



Considerations 

Streaming video over a network is not a 
trivial task. First, even compressed video data 
requires relatively high bandwidth. Limiting 
the bit-rate to about 700 kbps is desirable to 
support streaming two standard-definition 
video streams over a single 1.5 Mbps DSL con- 
nection. For this reason, using the new H.264 
and SMPTE VC-1 video codecs is highly desir- 
able. The lower bit-rates achievable with H.264 
and VC-1 also enable a larger area to be ser- 
viced since DSL bit-rate decreases with dis- 
tance. 
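
The bandwidth budget behind these figures can be checked with a few lines of Python. This is only a rough sketch: the function name is illustrative, protocol overhead is ignored unless an overhead fraction is supplied, and the "equivalent MPEG-2" rates simply scale the 700 kbps target by the 2-3x compression advantage quoted for the newer codecs in the introduction.

```python
def streams_supported(link_kbps: float, stream_kbps: float,
                      overhead_fraction: float = 0.0) -> int:
    """Number of whole video streams that fit on a link after reserving an
    optional fraction of the capacity for protocol overhead."""
    usable_kbps = link_kbps * (1 - overhead_fraction)
    return int(usable_kbps // stream_kbps)

DSL_KBPS = 1500        # the 1.5 Mbps DSL connection discussed above
NEW_CODEC_KBPS = 700   # target bit-rate per standard-definition stream

print(streams_supported(DSL_KBPS, NEW_CODEC_KBPS))       # H.264 or VC-1
print(streams_supported(DSL_KBPS, NEW_CODEC_KBPS * 2))   # MPEG-2 at 2x the rate
print(streams_supported(DSL_KBPS, NEW_CODEC_KBPS * 3))   # MPEG-2 at 3x the rate
```

Two 700 kbps streams leave only about 100 kbps of headroom on the link, which is why the lower bit-rates of the newer codecs matter so much here.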

Second, streaming video requires real-time 
transfers to avoid interruptions in the playback 
process. This requires the video servers and 
real-time encoders to be able to stream the 
video continuously and avoid network conges- 
tion. To address this issue, standards are avail- 
able to reserve bandwidth resources along the 
network. Multicasting is also being used to 
reduce network bandwidth requirements fur- 
ther. 
Third, streaming video is usually bursty.
Streaming video clients have a receive buffer 
of limited size. If measures are not taken to 
smooth the transmitted bit-rate, the receive 
buffer may overflow or underflow. To address 
this issue, additional protocols are used to 
manage the timing issues. 
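
A minimal sketch of the overflow and underflow problem follows. It is a toy model rather than any real client: time advances in fixed ticks, the decoder drains a fixed amount per tick, and all names and numbers are assumptions chosen for illustration.

```python
def simulate_buffer(arrivals, drain_per_tick, capacity, start_level=0):
    """Track a client receive buffer one tick at a time.

    arrivals: amount of data received in each tick (possibly bursty)
    drain_per_tick: amount the decoder consumes in each tick
    Returns (overflow_ticks, underflow_ticks).
    """
    level = start_level
    overflows, underflows = [], []
    for tick, received in enumerate(arrivals):
        level += received
        if level > capacity:        # burst exceeds the buffer: excess is dropped
            overflows.append(tick)
            level = capacity
        if level < drain_per_tick:  # decoder has too little data to play
            underflows.append(tick)
            level = 0
        else:
            level -= drain_per_tick
    return overflows, underflows


bursty = [400, 0, 0, 0, 400, 0, 0, 0]  # same average rate as the smooth case
smooth = [100] * 8
print(simulate_buffer(bursty, drain_per_tick=100, capacity=300))
print(simulate_buffer(smooth, drain_per_tick=100, capacity=300))
```

Both inputs deliver the same total amount of data, but only the smoothed one keeps the buffer between empty and full.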



Multicasting 

There are three common techniques of 
streaming real-time audio and video over a net- 
work: 

Unicast, where a server sends data to one 
receiver, as shown in Figure 19.1. The port 
number is chosen by the receiver. 

Broadcast, where data is sent from one server 
to all receivers, as illustrated in Figure 19.2. 

Multicast, where data is sent from one server 
to a group of receivers, as shown in Figure 
19.3. The server picks the multicast IP 
address and port. This is a typical case for 
live and near-video-on-demand (NVoD) appli- 
cations. 

The recent support for multicasting is a 
result of needing real-time distribution of large 
amounts of data, such as audio and video, com- 
bined with an increasing number of users. In 
this environment, multicasting is an excellent 
way to save network and server capacity. 
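
On the receiving side, joining a multicast group is a socket option. The sketch below uses the standard Python `socket` module; the group address 239.1.1.1 and port 5004 in the usage note are assumptions for illustration, not values from this chapter.

```python
import ipaddress
import socket
import struct


def membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Build the ip_mreq structure: group address, then local interface address."""
    if not ipaddress.ip_address(group).is_multicast:
        raise ValueError(f"{group} is not a multicast address")
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))


def open_multicast_receiver(group: str, port: int) -> socket.socket:
    """Open a UDP socket and ask the network to deliver traffic for the group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock
```

A receiver would call `open_multicast_receiver("239.1.1.1", 5004)` and then read datagrams with `recvfrom()`. The server sends a single copy to the group address no matter how many receivers have joined.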



RTSP-Based Solutions 

With proprietary solutions, each video 
server vendor has its own unique streaming 
protocols and file formats, requiring a client to 
support multiple protocols or be tailored to a 
specific vendor. In an effort to develop an 
open, standards-based solution, the Internet 
Engineering Task Force (IETF) developed 
several protocols to enable cross-platform 
connectivity and communications between clients 
and servers. 

RTSP is the control protocol for initiating and 
directing delivery of streaming data from 
video servers, implementing a remote control 
capability. RTSP does not deliver the multi- 
media data, though the RTSP connection 
may be used to tunnel RTP data for ease of 
use with firewalls and other network devices. 

RTP is the transport protocol for the delivery 
of real-time data, including streaming audio 
and video. RTP and RTSP are usually used 
together, but either protocol can be used 
without the other. 

RTCP is a part of RTP and helps with lip syn- 
chronization and Quality-of-Service (QoS) 
management. 

RSVP is the protocol for establishing and 
maintaining desired QoS levels, ensuring 
adequate network resources (such as band- 
width) are available. 

RTSP 

The Real-Time Streaming Protocol (RTSP) 
establishes and controls one or more time-syn- 
chronized streams of audio and video data 
between a server (source) and client 
(receiver). The server provides playback or 
recording services for the streams while the 
client requests continuous data from the 
server. 

RTSP provides “VCR-style” control func- 
tionality for the audio and video streams, 
including play, pause, fast forward, and 
reverse. It also provides: 








Figure 19.1. Unicast Example. Three copies of the same data are sent point-to-point as streams 
D1, D2, and D3 to receivers 1, 2, and 3. 




Figure 19.2. Broadcast Example. One copy of the same data (D) is sent to all receivers. 




Figure 19.3. Multicast Example. One copy of the same data (D1) is multicast to receivers 1 and 2. 
Note the bandwidth savings locally and across the networks as the number of receivers increases. 
Retrieval of program information from a 
server. The client can request a list of 
available programs and their description 
(program description) via a web browser (HTTP) 
or other technique. If a program is being 
multicast, the program description also contains 
the multicast addresses and ports used. If a 
program is to be sent to only one client 
(unicast), the client provides the destination 
address. 

Invitation of a server to a conference. A server 
can be invited to join an existing conference, 
either to provide or record data. 

Adding media to an existing presentation. Par- 
ticularly useful for live presentations, this 
enables a server to inform a client if addi- 
tional data is available. 



RTSP versus HTTP 

RTSP provides the same services for 
streaming audio and video as HTTP does for 
text and graphics when browsing the web. It is 
designed to have a similar syntax and 
operation, enabling most HTTP extensions to be 
easily adapted to RTSP. For example, the RTSP 
URL 

rtsp://media.example.com:554/twister 

identifies the presentation “twister,” which 
may be composed of audio and video streams. 
The RTSP URL 

rtsp://media.example.com:554/twister/audio 

identifies the audio stream within the presenta- 
tion “twister,” which can be controlled via 
RTSP requests to port 554 of server 
media.example.com. 
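
As a rough illustration, the two example URLs can be taken apart with Python's standard `urllib.parse`. Reading the first path segment as the presentation and the second as a stream is an assumption that matches these examples only; servers are free to lay out paths differently.

```python
from urllib.parse import urlsplit

RTSP_DEFAULT_PORT = 554  # used when the URL does not name a port


def split_rtsp_url(url: str):
    """Return (host, port, presentation, stream) for an rtsp:// URL."""
    parts = urlsplit(url)
    if parts.scheme != "rtsp":
        raise ValueError("not an RTSP URL")
    segments = [s for s in parts.path.split("/") if s]
    presentation = segments[0] if segments else None
    stream = segments[1] if len(segments) > 1 else None
    return parts.hostname, parts.port or RTSP_DEFAULT_PORT, presentation, stream


print(split_rtsp_url("rtsp://media.example.com:554/twister"))
print(split_rtsp_url("rtsp://media.example.com:554/twister/audio"))
```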



There is some overlap in functionality 
between RTSP and HTTP since the user inter- 
face is often implemented using web pages. 
For this reason, RTSP supports different hand- 
off points between a web and video server. For 
example, the presentation description can be 
retrieved using HTTP or RTSP, allowing stan- 
dalone RTSP servers and clients which do not 
support HTTP. Figure 19.4 illustrates using a 
web server for the presentation and a separate 
video server for the content. 

RTSP differs from HTTP in two major 
areas. First, unlike HTTP, an RTSP-compatible 
video server has to maintain session states in 
order to correlate RTSP requests with a 
stream. Second, while HTTP is basically an 
asymmetric protocol (the client issues 
requests and the server responds), both the 
video server and client can issue requests with 
RTSP. For example, the video server can issue 
a request to set the playback parameters of a 
stream. 
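
The HTTP-like syntax and the session state can both be seen in the text of a request. The sketch below formats RTSP 1.0 requests as plain strings; the helper name and the session identifier are hypothetical, and no network connection is made.

```python
def rtsp_request(method: str, url: str, cseq: int, headers=None) -> str:
    """Format an RTSP 1.0 request: request line, CSeq, extra headers, blank line."""
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    return "\r\n".join(lines) + "\r\n\r\n"


url = "rtsp://media.example.com:554/twister"

# Ask for the presentation description, much like an HTTP GET.
describe = rtsp_request("DESCRIBE", url, cseq=1)

# Later requests carry the session identifier issued by the server,
# which is how the server correlates them with a stream.
play = rtsp_request("PLAY", url, cseq=2, headers={"Session": "12345678"})

print(describe)
print(play)
```

Unlike HTTP, the `Session` header ties a PLAY or PAUSE request to state the server has kept since the stream was set up.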

Stream Properties 

The properties of a stream are defined in a 
presentation description file, which may include 
the encoding format, language, RTSP URLs, 
destination address, port, and other parame- 
ters. The presentation description file is 
obtained by the client using HTTP or other 
means. RTSP requests are usually sent on a 
channel independent of the data channel. 

RTP 

The Real-Time Transport Protocol (RTP) 
is a packet-based protocol for the transfer of 
real-time data, such as audio and video 
streams. Designed primarily for multicast, 
RTP can be also used for unicast and video-on- 
demand. 
Packets sent over a network have 
dictable delay and jitter, complicating the 
streaming of real-time video. To overcome 
these issues, the RTP packet header includes 
timestamping, loss detection, payload identifi- 
cation, source identification, and security. This 
information is used at the applications level to 
implement lost packet recovery, congestion 
control, etc. 
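
As a sketch, the fixed 12-byte RTP header can be unpacked as follows. The field layout is the one defined in RFC 3550; the tuple and function names are assumptions made for this example.

```python
import struct
from collections import namedtuple

RtpHeader = namedtuple(
    "RtpHeader",
    "version padding extension csrc_count marker payload_type "
    "sequence timestamp ssrc",
)


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the fixed RTP header found at the start of every RTP packet."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=b0 >> 6,              # always 2 for current RTP
        padding=bool(b0 & 0x20),
        extension=bool(b0 & 0x10),
        csrc_count=b0 & 0x0F,
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,       # payload identification
        sequence=seq,                 # loss detection and reordering
        timestamp=timestamp,          # timing reconstruction
        ssrc=ssrc,                    # source identification
    )


sample = struct.pack("!BBHII", 0x80, 0x80 | 33, 1000, 90000, 0x12345678) + b"data"
print(parse_rtp_header(sample))
```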

RTP is typically run on top of UDP to make 
use of its multiplexing and checksum func- 
tions. While TCP provides a connection-ori- 
ented and reliable flow between two hosts, 
UDP provides a connectionless (but unreli- 
able) datagram service over the network. UDP 
was chosen as the target transport protocol for 
RTP for two reasons. First, RTP is primarily 
designed for multicast; the connection-ori- 
ented TCP does not scale well and therefore is 
not suitable. Second, for real-time data, reliabil- 
ity is not as important as timely delivery; the 
higher reliability provided by TCP using 
retransmission is not desirable. For example, 
during network congestion some packets may be 
lost, and the application simply plays at lower 
but acceptable quality. If the protocol insisted 
on reliable transmission, the retransmitted 
packets could increase the delay, congest the 
network further, and eventually starve the 
receiving application. Figure 19.5 illustrates an 
RTP packet encapsulated within a UDP/IP packet. 

RTP itself does not provide mechanisms to 
ensure timely delivery. It requires support 
from lower layers that control resources in 
switches and routers. RSVP may be used to 
reserve such resources and to provide the 
requested QoS. RTP is also designed to work 
in conjunction with RTCP to get feedback on 
quality of data transmission and information 
about participants in the session. 


RTP is a protocol framework that is 
deliberately not complete. It is open to new payload 
formats and new multimedia software. By adding 
new profile and payload specifications, 
RTP can easily be tailored to new data formats 
and new applications. 

Figure 19.4. Client, Web Server, and Video Server Communications. 

RTP Sessions 

To set up an RTP session, the application 
defines a pair of destination addresses (one 
network address plus two ports for RTP and 
RTCP). In a multimedia session, each medium 
is usually carried using its own RTP session, 
with corresponding RTCP packets reporting 
the reception quality for that session. For 
example, audio and video typically use sepa- 
rate RTP sessions, enabling a receiver to select 
whether or not to receive a particular medium. 
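
One way to model a session is an address plus an RTP port, with the RTCP port derived from it. The even/odd port pairing follows the convention in RFC 3550; the class name, address, and port numbers are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RtpSession:
    medium: str      # for example "audio" or "video"
    address: str     # destination network address
    rtp_port: int    # by convention an even port number

    def __post_init__(self):
        if self.rtp_port % 2:
            raise ValueError("RTP port should be even; RTCP uses the next odd port")

    @property
    def rtcp_port(self) -> int:
        return self.rtp_port + 1


# Audio and video travel in separate sessions, so a receiver can join either one.
audio = RtpSession("audio", "239.1.1.1", 5004)
video = RtpSession("video", "239.1.1.1", 5006)
print(audio.rtcp_port, video.rtcp_port)
```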

Timestamps 

Timestamping is important for real-time 
applications. The receiver uses timestamps to 
reconstruct the original timing in order to play 
the data at the correct rate. Timestamps are 
also used to synchronize different streams, 
such as audio and video data. However, RTP 
itself is not responsible for the synchroniza- 
tion; this is done at the application level. 

In addition, UDP does not guarantee that 
packets arrive in the order they were sent. 
Therefore, sequence numbers are used to place 
incoming data packets in the correct order and 
to detect packet loss. When a video frame is 
split into several RTP packets, some video 
formats allow all of them to have the same 
timestamp. Thus, timestamps alone are not 
enough to put packets back into the correct 
sequence. 
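
A minimal sketch of reordering by sequence number is shown below. RTP sequence numbers are 16 bits wide and wrap around, so the sort key is taken relative to the first packet received. It assumes the packets in hand span less than half the sequence space, and the tuple layout is hypothetical.

```python
def reorder_and_find_gaps(packets):
    """Sort (sequence, timestamp, data) tuples and list missing sequence numbers."""
    if not packets:
        return [], []
    base = packets[0][0]
    # Centre the 16-bit space on the first packet so wraparound sorts correctly.
    ordered = sorted(packets, key=lambda p: (p[0] - base + 0x8000) & 0xFFFF)
    missing = []
    for prev, cur in zip(ordered, ordered[1:]):
        gap = (cur[0] - prev[0]) & 0xFFFF
        missing.extend((prev[0] + i) & 0xFFFF for i in range(1, gap))
    return ordered, missing


# Two packets per frame share a timestamp; only the sequence number orders them.
received = [(65534, 3000, b"a"), (0, 6000, b"c"), (65535, 3000, b"b"), (2, 6000, b"e")]
ordered, missing = reorder_and_find_gaps(received)
print([p[0] for p in ordered], missing)
```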

Payload Identification 

A payload identifier specifies the type of 
content and compression format. This enables 
the receiver to know how to interpret and 
present the content. Several types of payloads 
are supported: 

Various audio formats, including CELP, linear 
PCM, ADPCM, G.711, G.721, G.722, Dolby® 
Digital, Dolby® Digital Plus, MP3, and so on 

MPEG-1 audio and video elementary streams 

MPEG-1 system streams 

MPEG-2 audio and video elementary streams 

MPEG-2 program and transport streams 

MPEG-4 audio and visual streams 

MPEG-4 OD, BIFS, OCI, and IPMP streams 

JPEG and M-JPEG video streams 

DV (IEC 61834), H.261, and H.263 streams 

SMPTE 421M (VC-1) video streams 

MPEG-4.10 (H.264) video streams 

ASF 



[Figure 19.5 layout, left to right: IP header, UDP header, RTP header, RTP payload.] 

Figure 19.5. RTP Packet Encapsulated Within a UDP/IP Packet. 
Additional payload types may be added by 
providing a profile and payload format specifi- 
cation. At any given time of transmission, an 
RTP sender can only send one type of payload, 
although the payload type may change during 
transmission, for example, to adjust to network 
congestion. 
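
The payload identifier is a 7-bit number. The sketch below lists a few of the static assignments from RFC 3551 and treats 96 through 127 as dynamic values agreed out of band (for example, in the presentation description). The table is deliberately partial, and the helper name is an assumption.

```python
# A partial list of static payload type assignments from RFC 3551.
STATIC_PAYLOAD_TYPES = {
    0: "PCMU (G.711 mu-law audio)",
    8: "PCMA (G.711 A-law audio)",
    9: "G722 audio",
    14: "MPA (MPEG audio)",
    26: "JPEG video",
    31: "H261 video",
    32: "MPV (MPEG-1/MPEG-2 video)",
    33: "MP2T (MPEG-2 transport stream)",
    34: "H263 video",
}


def describe_payload_type(pt: int, dynamic_map=None) -> str:
    """Name a payload type; dynamic values need a mapping supplied by the session."""
    if not 0 <= pt <= 127:
        raise ValueError("payload type is a 7-bit value")
    if pt in STATIC_PAYLOAD_TYPES:
        return STATIC_PAYLOAD_TYPES[pt]
    if 96 <= pt <= 127:
        return (dynamic_map or {}).get(pt, "dynamic (unmapped)")
    return "not listed in this sketch"


print(describe_payload_type(33))
print(describe_payload_type(96, {96: "H264"}))
```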

RTCP 

The Real-Time Control Protocol (RTCP) is 
a control protocol designed to work in 
conjunction with RTP. In an RTP session, clients 
periodically send RTCP packets to the server to 
convey feedback on the quality of data delivery 
and session membership information. 

Five types of RTCP packets to convey con- 
trol information are defined: 

Receiver report. Receiver reports contain 
information about data delivery, including the 
highest packet number received, number of 
packets lost, inter-arrival jitter, and times- 
tamps to calculate the round-trip delay 
between the server and the client. 

Sender report. Sender reports contain the 
receiver report information and information 
on inter-media synchronization, cumulative 
packet counters, and number of bytes sent. 

Source description items. They contain infor- 
mation to describe the sources. 

Bye. Indicates end of participation. 

Application-specific functions. Intended for 
experimental use as new applications and 
new features are developed. 
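
Each of these packet types starts with the same 4-byte header, which a receiver can use to tell them apart. The sketch below uses the type codes and header layout from RFC 3550; the function name and returned dictionary are assumptions for illustration.

```python
import struct

RTCP_PACKET_TYPES = {
    200: "sender report",
    201: "receiver report",
    202: "source description",
    203: "bye",
    204: "application-specific",
}


def parse_rtcp_header(packet: bytes) -> dict:
    """Decode the 4-byte header common to all RTCP packets."""
    if len(packet) < 4:
        raise ValueError("RTCP packet shorter than its 4-byte header")
    b0, packet_type, length_words = struct.unpack("!BBH", packet[:4])
    return {
        "version": b0 >> 6,
        "padding": bool(b0 & 0x20),
        "count": b0 & 0x1F,  # e.g. number of report blocks
        "type": RTCP_PACKET_TYPES.get(packet_type, "unknown"),
        # The length field counts 32-bit words, minus one.
        "length_bytes": (length_words + 1) * 4,
    }


# Header of a receiver report carrying one report block (32 bytes in total).
print(parse_rtcp_header(struct.pack("!BBH", 0x81, 201, 7)))
```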



Through these control information pack- 
ets, RTCP provides: 

QoS monitoring and congestion control. 
Servers can adjust transmission based on the 
client feedback. Clients can determine whether 
congestion is local, regional, or global. 
Network performance can be evaluated during 
multicast distribution. 

Source identification. In RTP data packets, 
sources are identified by randomly generated 
32-bit identifiers, which are not convenient for 
users. Source description packets contain textual 
information such as the user's name, telephone 
number, e-mail address, etc. 

Inter-media synchronization. Sender reports 
carry the timing information needed to 
synchronize separate streams, such as lip 
synchronization of audio and video. 

Control information scaling. When the num- 
ber of participants increases, steps must be 
taken to prevent the control traffic from over- 
whelming network resources. RTP limits the 
control traffic to 5% of the overall session traf- 
fic. This is enforced by adjusting the RTCP 
generating rate according to the number of 
participants. 
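
The scaling rule can be sketched as a simple calculation: the 5% budget is shared among all participants, so the interval between reports grows with the group. This is a simplification. RFC 3550 also randomizes the interval and divides the budget between senders and receivers, both omitted here. The 100-byte average packet size is an assumption, and the 5-second floor is the minimum interval from that RFC.

```python
def rtcp_interval_seconds(session_kbps: float, participants: int,
                          avg_rtcp_packet_bytes: float = 100.0,
                          rtcp_fraction: float = 0.05,
                          minimum: float = 5.0) -> float:
    """Approximate time between RTCP reports from a single participant."""
    rtcp_bytes_per_second = session_kbps * 1000 / 8 * rtcp_fraction
    interval = participants * avg_rtcp_packet_bytes / rtcp_bytes_per_second
    return max(interval, minimum)


for n in (10, 1000, 2000):
    print(n, "participants:", round(rtcp_interval_seconds(700, n), 1), "seconds")
```

With a 700 kbps session, a small group reports at the minimum interval, while doubling a large group doubles the interval and keeps the total control traffic constant.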

Combined, RTP and RTCP provide the 
necessary functionality and control mecha- 
nisms for transmitting real-time content. How- 
ever, RTP and RTCP themselves are not 
responsible for higher-level tasks such as 
assembly and synchronization. These are done 
at the application level. 
RSVP 

The Resource Reservation Protocol 
(RSVP) enhances the network with support for 
QoS. 

RSVP is used to set up reservations for net- 
work resources, such as bandwidth. When a 
client requests a specific QoS for its data 
stream, it delivers its request to nodes (or rout- 
ers) along the network path using RSVP. At 
each node, RSVP attempts to make a resource 
reservation for the stream. Once a reservation 
is set up, RSVP is also responsible for maintain- 
ing the requested level of service. 

Reservation requests do not need to travel 
all the way to the server. Instead, each reserva- 
tion request travels upstream until it meets 
another reservation request for the same data 
stream, then merges with that reservation. 
This reservation merging gives RSVP its 
primary advantage, scalability: a large number 
of clients can be added to a multicast without 
increasing the data traffic significantly. 
RSVP easily scales to large multicast groups; 
the average protocol overhead decreases as 
the number of participants increases. 
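
Reservation merging can be illustrated with a toy distribution tree. This is a sketch of the idea only, not of the RSVP message formats: the node names are hypothetical, and every request is assumed to ask for the same stream with the same QoS.

```python
def reserve(parent: dict, reserved: set, client: str) -> int:
    """Send a reservation upstream from a client toward the server.

    The request stops at the first node that already holds a reservation
    for the stream and merges there. Returns the number of nodes that had
    to install a new reservation.
    """
    new_reservations = 0
    node = parent.get(client)
    while node is not None and node not in reserved:
        reserved.add(node)
        new_reservations += 1
        node = parent.get(node)
    return new_reservations


# server -> r1 -> r2 -> (c1, c2), and r1 -> r3 -> c3
parent = {"c1": "r2", "c2": "r2", "c3": "r3",
          "r2": "r1", "r3": "r1", "r1": "server", "server": None}
reserved = set()
for client in ("c1", "c2", "c3"):
    print(client, "installed", reserve(parent, reserved, client), "new reservations")
```

The first client's request travels all the way to the server. The second merges immediately at its router, and the third installs only one new reservation before merging.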

RSVP supports both multicast and unicast, 
and adapts to changing memberships and 
routes. Designed to utilize the robustness of 
current Internet routing algorithms, RSVP 
uses underlying routing protocols to deter- 
mine where it should carry reservation 
requests. As routing changes paths to adapt to 
network changes, RSVP adapts its reservation 
to the new paths. 



ISMA 

The Internet Streaming Media Alliance 
(ISMA) is a nonprofit industry alliance 
founded by Apple Computer, Cisco Systems, 
IBM, Kasenna, Philips, and Sun Microsystems. 
Since its inception, it has received wide indus- 
try support. The mission is to facilitate and 
promote the adoption of an open architecture 
for streaming audio and video over the Inter- 
net. 

The ISMA v1.0 specification provides tools 
to stream audio and video over networks at up 
to 1.5 Mbps. It uses MPEG-4 audio/video com- 
pression and IETF protocols (RTP, RTSP, and 
SDP) for content transport and control. 

ISMA v1.0 defines two hierarchical 
profiles: Profile 0 and Profile 1. Profile 1 supports 
all the tools supported by Profile 0, along with 
some additional tools. 

Profile 0 

Profile 0 is aimed at streaming audio and 
video over wireless and narrowband networks 
to devices with limited audio and video capabil- 
ities. 

Video uses MPEG-4.2 SP@L1 (QCIF, 176 x 
144). Audio uses MPEG-4.3 HQ@L2. Up to two 
channels of audio are supported, with a sam- 
pling rate up to 48 kHz. It has a maximum total 
bit-rate of 128 kbps. 

Profile 1 

Profile 1 is aimed at streaming audio and 
video over broadband networks to provide the 
user with a better viewing experience. 
Video uses MPEG-4.2 ASP@L3 (CIF, 352 x 
288). Audio uses MPEG-4.3 HQ@L2. Up to two 
channels of audio are supported, with a sam- 
pling rate up to 48 kHz. It has a maximum total 
bit-rate of 1.5 Mbps. 
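
The two profiles can be written down as a small table of limits, which also shows their hierarchical nature: anything that fits Profile 0 fits Profile 1. The dictionary keys and the helper function are assumptions for illustration, while the limits are those stated above.

```python
ISMA_V1_PROFILES = {
    0: {"video": "MPEG-4.2 SP@L1", "audio": "MPEG-4.3 HQ@L2",
        "max_width": 176, "max_height": 144, "max_total_kbps": 128},
    1: {"video": "MPEG-4.2 ASP@L3", "audio": "MPEG-4.3 HQ@L2",
        "max_width": 352, "max_height": 288, "max_total_kbps": 1500},
}


def lowest_profile_for(width: int, height: int, total_kbps: float):
    """Return the lowest ISMA v1.0 profile whose limits cover the stream, or None."""
    for profile_id in sorted(ISMA_V1_PROFILES):
        limits = ISMA_V1_PROFILES[profile_id]
        if (width <= limits["max_width"]
                and height <= limits["max_height"]
                and total_kbps <= limits["max_total_kbps"]):
            return profile_id
    return None


print(lowest_profile_for(176, 144, 64))    # QCIF at a low bit-rate
print(lowest_profile_for(352, 288, 768))   # CIF over broadband
print(lowest_profile_for(720, 480, 1500))  # larger than either profile allows
```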



Broadcast over IP 

ARIB and DVB transport streams may also 
be transmitted over a broadband IP network. 
The transport stream packets are encapsulated 
in RTP packets and sent via IP multicast to 
receivers. 

For DVB, this is called DVB-IPI or Digital 
Video Broadcasting - Internet Protocol 
Infrastructure. Don’t confuse DVB-IPI with DVB-IP, 
which enables IP services over DVB. 
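
The encapsulation step can be sketched as grouping fixed-size transport stream packets into RTP payloads. Transport stream packets are 188 bytes long and begin with the sync byte 0x47. Carrying seven per RTP packet is a common choice, assumed here, because 1316 bytes plus the RTP, UDP, and IP headers stays under a 1500-byte Ethernet MTU.

```python
TS_PACKET_SIZE = 188
TS_SYNC_BYTE = 0x47


def group_ts_packets(ts_bytes: bytes, per_rtp_packet: int = 7):
    """Split a transport stream into RTP payloads of whole TS packets."""
    if len(ts_bytes) % TS_PACKET_SIZE:
        raise ValueError("input is not a whole number of 188-byte TS packets")
    for offset in range(0, len(ts_bytes), TS_PACKET_SIZE):
        if ts_bytes[offset] != TS_SYNC_BYTE:
            raise ValueError(f"missing sync byte at offset {offset}")
    chunk = per_rtp_packet * TS_PACKET_SIZE
    return [ts_bytes[i:i + chunk] for i in range(0, len(ts_bytes), chunk)]


# Ten dummy TS packets: a sync byte followed by 187 zero bytes each.
stream = (bytes([TS_SYNC_BYTE]) + bytes(187)) * 10
payloads = group_ts_packets(stream)
print([len(p) for p in payloads])
```

Each payload would then be prefixed with an RTP header and sent to the multicast group.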



Conditional Access (DRM) 

For broadband IP networks, conditional 
access is commonly called DRM (Digital 
Rights Management). 

DRM solutions used in early IPTV deploy- 
ments are similar in principle to DVB Simul- 
crypt. The MPEG decoder chip contained an 
embedded (and usually modified) AES or 
3DES decryption block; an ISO 7816 smart- 
card provided decryption key information to 
the AES/3DES descrambling circuitry. 

Newer DRM solutions do not use a smart- 
card. Software securely running inside the 
MPEG decoder chip replaces the smartcard, 
lowering cost and providing a more secure 
solution. The software DRM solutions also typ- 
ically include the ability to control the usage 
and re-distribution of the content after it has 
been initially received and decrypted. Capabili- 
ties of the DRM can include: 



Turning analog and/or digital video copy pro- 
tection on/off 

Limiting the resolution of the analog video out- 
puts (constrained image) 

Limiting the sample rate and size of the digital 
audio outputs 

Disabling analog and/or digital audio and/or 
video outputs 
8-VSB See Vestigial Sideband. 

A-VSB To better address the mobile market and compete with DVB-H and DMB, 
ATSC A-VSB will improve dynamic multipath tracking, allow the use of 
layered (hierarchical) modulation, support time division multiplexing, and 
support frame slicing. To support improved terrestrial coverage, A-VSB will also 
ease synchronization of broadcast signal timing of different towers in a Single 
Frequency Network (SFN). 

AC-3 Another name for Dolby® Digital compressed audio. 



AC Coupled AC coupling passes an analog video signal through a capacitor to remove any 
DC offset, or the overall voltage level that the video signal rides on. One way to 
find the signal is to remove the DC offset by AC coupling, and then do DC 
restoration to add a known DC offset (one that we selected). Another reason AC 
coupling is important is that it can remove large (and harmful) DC offsets. 

Active Video The part of the video signal that contains picture information. Most of the 
active video, if not all of it, is visible on the display. 



AFC See Automatic Frequency Control. 

AGC See Automatic Gain Control. 

Alpha See Alpha Channel and Alpha Mix. 
Alpha Mix The alpha value is used to control the mixing (or blending), on a 
sample-by-sample basis, of two images. 

new sample = (alpha) (sample A color) + (1 - alpha) (sample B color) 

Alpha typically has a normalized value of 0 to 1. When you hear about 32-bit 
frame buffers, what this really means is that there are 24 bits of color and 8 
bits of alpha. 

AM See Amplitude Modulation. 

AMOL Abbreviation for Automated Measurement of Lineups. This 480i VBI signal, 
typically on lines 20, 22, 283 and/or 284, is used by Nielsen boxes. 

Amplitude Modulation A method of encoding data onto a carrier, such that the 
amplitude of the carrier is proportional to the data value. 

Anti-Alias Filter A lowpass filter used to bandwidth-limit a signal to less than 
one-half the sampling rate. 

ARIB Abbreviation for Association of Radio Industries and Businesses, a standards 
organization in Japan. The ARIB provides several specifications that form the 
core of the ISDB digital television system used in Japan. 

ARIB STD-B10 Japan ISDB-S and ISDB-T digital television service information specification. 

ARIB STD-B20 Japan ISDB-S (satellite) digital television system specification. 

ARIB STD-B21 Japan ISDB-S and ISDB-T digital television receiver specification. 

ARIB STD-B24 Japan ISDB-S and ISDB-T digital television data broadcasting specification. 

ARIB STD-B25 Japan ISDB-S and ISDB-T digital television access control specification. 

ARIB STD-B31 Japan ISDB-T (terrestrial) digital television system specification. 
ARIB STD-B32 Japan ISDB-S and ISDB-T digital television video coding, audio coding, and 
multiplexing specification. 

ARIB STD-B38 Japan ISDB-S and ISDB-T digital television home server specification. 

Artifacts In the video domain, artifacts are blemishes, noise, snow, spots, whatever. 
When you have an image artifact, something is wrong with the picture from a 
visual standpoint. Don’t confuse this term with not having the display properly 
adjusted. For example, if the hue control is set wrong, the picture will look 
bad, but this is not an artifact. An artifact is some physical disruption of the 
image. 

Aspect Ratio The ratio of the width of the picture to the height. Displays commonly have a 
4:3 or 16:9 aspect ratio. Program material may have other aspect ratios (such 
as 2.35:1), resulting in its being letterboxed on the display. 

Asynchronous Refers to circuitry without a common clock or timing signal. 

ATSC Advanced Television Systems Committee. They defined the HDTV standards 
for the United States. Other countries are also adopting the ATSC HDTV 
standard. 

ATSC A/52 Defines the Dolby® Digital and Dolby® Digital Plus audio compression 
standards for ATSC HDTV. 

ATSC A/53 Defines ATSC HDTV for the United States. 

ATSC A/65 Defines the program and system information protocol (PSIP) for ATSC 
HDTV. 

ATSC A/70 Defines a standard for the conditional access system for ATSC HDTV. 

ATSC A/80 Defines a standard for modulation and coding of ATSC data delivered over 
satellite for digital television contribution and distribution applications. 

ATSC A/81 Describes the transmission system for ATSC Direct-to-Home (DTH) satellite 
broadcast system. 

ATSC A/90 Defines the data broadcast standard for ATSC. 
ATSC A/92 Defines the delivery of Internet Protocol (IP) multicast sessions and usage of 
the ATSC A/90 data broadcast standard for IP multicast. 

ATSC A/93 Defines the transmission of synchronized data elements, and synchronized 
and asynchronous events. 

ATSC A/94 Defines a Data Application Reference Model, including a binding of 
application environment facilities onto the ATSC A/90 data broadcast standard. 

ATSC A/95 Defines the ATSC Transport Stream File System (TSFS) standard for delivery 
of hierarchical name-spaces, directories, and files. It builds on the ATSC A/90 
data service delivery scheme. 

ATSC A/96 Defines a core suite of protocols to enable remote interactivity in ATSC 
television environments. 

ATSC A/97 Defines a Software Download Data Service for ATSC. 

ATSC A/100 This DTV Application Software Environment (DASE) defines a software layer 
(middleware) that allows programming content and applications to run on a 
common ATSC receiver. 

ATSC A/101, ATSC A/102 Defines the Advanced Common Application Platform (ACAP) for ATSC. 

Audio Modulation Refers to modifying an audio subcarrier with audio information so that it may 
be mixed with the video information and transmitted. 

Audio Subcarrier A specific frequency that is modulated with audio data. 

Automatic Frequency Control (AFC) A technique to lock onto and track a desired frequency. 

Automatic Gain Control (AGC) A circuit that has a constant output amplitude, regardless of 
the input amplitude. 

AVC Abbreviation for Advanced Video Codec, an early name for the MPEG-4.10 
(H.264) video codec. 
AVCHD High-definition camcorder specification that uses MPEG-4.10 (H.264) video 
compression. 

AVS Abbreviation for Audio Video coding Standard, developed in China. 

Back Porch The portion of the analog video waveform between the trailing edge of the 
horizontal sync and the start of active video. 

Bandpass Filter A circuit that allows only a selected range of frequencies to pass through. 

Bandwidth (BW) The range of frequencies a circuit will respond to or pass through. It may also 
be the difference between the highest and lowest frequencies of a signal. 

Bandwidth Segmented Orthogonal Frequency Division Multiplexing BST-OFDM attempts to 
improve on COFDM by modulating some OFDM carriers differently from others within 
the same multiplex. A given transmission channel may therefore be segmented, with 
different segments being modulated differently. 

Baseband When applied to audio and video, baseband means an audio or video signal 
that is not modulated onto another carrier (such as RF modulated to channel 3 
or 4, for example). In DTV, baseband also may refer to the compressed 
(unmodulated) bitstream. 

Black Burst Black burst is a composite video signal with a totally black picture. It is used to 
synchronize video equipment so the video outputs are aligned. Black burst 
tells the video equipment the vertical sync, horizontal sync and the chroma 
burst timing. 

Black Level This level defines what “black” is for a particular video system. If for some 
reason the video goes below this level, it is referred to as blacker-than-black. You 
could say that analog sync is blacker-than-black. 

Blanking On a CRT display, the scan line moves from the left edge to the right edge, 
jumps back to the left edge, and starts out all over again, on down the screen. 
When the scan line hits the right side and is about to be brought back to the 
left side, the video signal is blanked so that you can’t “see” the return path of 
the scan beam from the right to the left-hand edge. To blank the video signal, 
the analog video level is brought down to the blanking level, which is below 
the black level if a pedestal is used. 
Blanking Level That level of the video signal defined by the system to be where blanking 
occurs. This could be the black level if a pedestal is not used or below the 
black level if a pedestal is used. 

Blooming This is an effect, sometimes caused when video becomes whiter-than-white, in 
which a line that is supposed to be nice and thin becomes fat and fuzzy on the 
screen. 

Breezeway That portion of the analog video waveform between the trailing edge of 
horizontal sync and the start of color burst. 

Brightness This refers to how much light is emitted by the display, and is controlled by 
the intensity of the video level. 

BS.707 This ITU recommendation specifies the stereo audio specifications (Zweiton 
and NICAM 728) for the PAL and SECAM video standards. 

BST-OFDM Abbreviation for Bandwidth Segmented Orthogonal Frequency Division 
Multiplexing. 

BT.470 This ITU recommendation specifies the various NTSC, PAL and SECAM 
video standards used around the world. SMPTE 170M also specifies the (M) 
NTSC video standard used in the United States. BT.470 has replaced BT.624. 

BT.601 This ITU recommendation specifies the 720 x 480 (59.94 Hz), 960 x 480 (59.94 
Hz), 720 x 576 (50 Hz) and 960 x 576 (50 Hz) 4:2:2 YCbCr interlaced 
standards. 

BT.653 This ITU recommendation defines the various teletext standards used around 
the world. Systems A, B, C and D for both 480i and 576i video systems are 
defined. 

BT.656 This ITU recommendation defines a parallel interface (8-bit or 10-bit, 27 MHz) 
and a serial interface (270 Mbps) for the transmission of 4:3 BT.601 4:2:2 
YCbCr digital video between pro-video equipment. Also see SMPTE 125M. 

BT.709 This ITU recommendation specifies the 1920 x 1080 R'G'B' and 4:2:2 YCbCr 
interlaced and progressive 16:9 digital video standards. Frame rates of 60, 
59.94, 50, 30, 29.97, 25, 24 and 23.976 Hz are supported. 

BT.1119 This ITU recommendation defines the widescreen signaling (WSS) 
information for 480i and 576i video signals. For 576i video systems, WSS may be 
present on line 23, and on lines 22 and 285 for 480i video systems. 
BT.1124 This ITU recommendation defines the ghost cancellation reference (GCR) 
signal for NTSC and PAL. 

BT.1197 This ITU recommendation defines the PALplus standard, allowing the 
transmission of 16:9 programs over normal PAL transmission systems. 

BT.1358 This ITU recommendation defines the 720 x 480 (59.94 Hz) and 720 x 576 (50 
Hz) 4:2:2 YCbCr pro-video progressive standards. Also see SMPTE 293M. 

BT.1618 This ITU recommendation specifies a data structure for DV-based audio, data, 
and compressed video at data rates of 25 and 50 Mbps. Also see SMPTE 
314M. 

BT.1620 This ITU recommendation specifies a data structure for DV-based audio, data, 
and compressed video at data rates of 100 Mbps. Also see SMPTE 370M. 

BTSC This EIA TVSB5 standard defines a technique of implementing stereo audio 
for NTSC video. One FM subcarrier transmits an L+R signal, and an AM 
subcarrier transmits an L-R signal. 

Burst See Color Burst. 

Burst Gate This is a signal that tells an NTSC or PAL video decoder where the color burst 
is located within the scan line. 

B-Y The blue-minus-luma signal, also called a color difference signal. When added 
to the luma (Y) signal, it produces the blue video signal. 

Carrier A frequency that is modulated with data to be transmitted. 

CBR Abbreviation for constant bit-rate. 

CCIR Comite Consultatif International des Radiocommunications or International 
Radio Consultative Committee. The CCIR no longer exists; it has been 
absorbed into the parent body, the ITU. For a given “CCIR xxx” specification, 
see “BT.xxx.” 
CEA-608 United States closed captioning and extended data services (XDS) standard. 
Revision B added Copy Generation Management System - Analog (CGMS-A), 
content advisory (V-chip), Internet Uniform Resource Locators (URLs) using 
Text-2 (T-2) service, 16-bit Transmission Signal Identifier, and transmission of 
DTV PSIP data. 

CEA-708 DTV closed captioning standard. EIA CEB-8 also provides guidance on the 
use and processing of CEA-608 data streams embedded within ATSC streams, 
and augments CEA-708. 

CEA-805 This standard specifies how CGMS and AMOL data are carried on various 
analog video signals. 

CEA-861 The standard specifies how to include data, such as aspect ratio and format 
information, on HDMI. 

CGMS-A Copy Generation Management System - Analog. See CEA-608 and CEA-805. 

Chaoji VideoCD Another name for Super VideoCD. 

Chroma A video signal contains two parts that make up what you see on the display: 
the intensity part and the color part. Chroma is the color part. 

Chroma Bandpass In an NTSC or PAL video signal, the luma (black and white) and the chroma 
(color) information is combined together. If you want to decode an NTSC or 
PAL video signal, the luma and chroma must be separated. A chroma 
bandpass filter removes the luma from the video signal, leaving the chroma 
relatively intact. This works reasonably well except in images where the luma and 
chroma information overlap, meaning that we have luma and chroma stuff at 
the same frequency. The filter can’t tell the difference between the two and 
passes everything. This can make for a funny-looking picture. Next time 
you’re watching TV and someone is wearing a herringbone jacket or a shirt 
with thin, closely spaced stripes, take a good look. You may see a rainbow 
color effect moving through that area. What’s happening is that the video 
decoder thinks that the luma is chroma. Since the luma isn’t chroma, the 
video decoder can’t figure out what color it is and it shows up as a rainbow 
pattern. This problem can be overcome by using a comb filter. 
Chroma Burst 

Chroma 

Demodulator 

Chroma Key 



Chroma Trap 



Chrominance 



See Color Burst. 

After the NTSC or PAL video signal makes its way through the Y/C separator, 
the colors must be decoded. That’s what a chroma demodulator does. It takes 
the chroma output of the Y/C separator and recovers two color difference 
signals (typically I and Q or U and V). Now, with the luma information and two 
color difference signals, the video system can figure out what colors to dis- 
play. 

This is a method of combining two video images. An example of chroma key- 
ing in action is the nightly news person standing in front of a giant weather 
map. In actuality, the person is standing in front of a blue or green background 
and that image is mixed with a computer-generated weather map. This is how 
it works: a TV camera is pointed at the person and fed along with the image of 
the weather map into a box. Inside the box, a decision is made. Wherever it 
sees the blue or green background, it displays the weather map. Otherwise, it 
shows the person. So, whenever the person moves around, the box figures out 
where he or she is and displays the appropriate image. 
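
The decision made inside the "box" can be sketched per pixel. This is only an illustration, not how any particular keyer is built: pixels are assumed to be (R, G, B) tuples with 8-bit values, and the names `is_key_color` and `chroma_key`, along with the threshold values, are hypothetical choices made for the example.

```python
def is_key_color(pixel, min_green=160, max_other=100):
    """Return True if the pixel looks like the green backdrop (assumed thresholds)."""
    r, g, b = pixel
    return g >= min_green and r <= max_other and b <= max_other


def chroma_key(foreground, background):
    """Show the background wherever the foreground contains the key color."""
    return [
        [bg if is_key_color(fg) else fg for fg, bg in zip(fg_row, bg_row)]
        for fg_row, bg_row in zip(foreground, background)
    ]


GREEN = (0, 255, 0)        # backdrop behind the presenter
PERSON = (200, 150, 120)   # a pixel belonging to the presenter
MAP = (10, 20, 200)        # a pixel of the weather map

camera = [[GREEN, PERSON, GREEN]]
weather_map = [[MAP, MAP, MAP]]
keyed = chroma_key(camera, weather_map)
```

Real keyers usually work on the color difference signals and blend softly at the edges rather than making a hard yes/no decision for each pixel.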

In an NTSC or PAL video signal, the luma (black and white) and the chroma 
(color) information are combined. If you want to decode the video signal, 
the luma and chroma must be separated. The chroma trap is one method 
for separating the chroma from the luma, leaving the luma relatively intact. 
How does it work? The NTSC or PAL signal is fed to a trap filter. For all practi- 
cal purposes, a trap filter allows certain frequencies to pass through, but not 
others. The trap filter is designed with a response to remove the chroma so 
that the output of the filter only contains the luma. Since this trap stops 
chroma, it’s called a chroma trap. The sad part about all of this is that not only 
does the filter remove chroma, it removes luma as well if it exists within the 
frequencies where the trap exists. The filter only knows ranges and, depend- 
ing on the image, the luma information may overlap the chroma information. 
The filter can’t tell the difference between the luma and chroma, so it traps 
both when they are in the same range. What’s the big deal? Well, you lose 
luma and this means that the picture is degraded somewhat. Using a comb fil- 
ter for a Y/C separator is better than a chroma trap or chroma bandpass. 

In video, the terms chrominance and chroma are commonly (and incorrectly) 
interchanged. See the definition of Chroma. 
CIF 



Clamp 



Clipping Logic 



Closed 

Captioning 



Closed 

Subtitles 

Coded 

Orthogonal 

Frequency 

Division 

Multiplexing 



COFDM 
Color Bars 



Common Interface Format or Common Image Format. It has an active resolu- 
tion of 352 x 240 or 352 x 288. Variations of the CIF format include 2CIF (704 x 
240 or 704 x 288 resolution), DCIF (528 x 320 or 528 x 384 resolution), 4CIF 
(704 x 480 or 704 x 576 resolution), and HD-CIF (1920 x 1080 resolution). 

This is basically another name for the DC-restoration circuit. It can also refer 
to a switch used within the DC-restoration circuit. When it means DC restora- 
tion, then it’s usually used as “clamping.” When it’s the switch, then it’s just 
“clamp.” 

A circuit used to prevent illegal conversion. Some colors can exist in one color 
space but not in another. Right after the conversion from one color space to 
another, a color space converter might check for illegal colors. If any appears, 
the clipping logic is used to limit, or clip, part of the information until a legal 
color can be represented. Since this circuit clips off some information and is 
built using logic, it's not too hard to see how the name “clipping logic” was 
developed. 

A service which decodes text information transmitted with the video signal 
and displays it on the display. The two major closed captioning specifications 
are CEA-608 and CEA-708. 

See Subtitles. 



Coded orthogonal frequency division multiplexing, or COFDM, transmits dig- 
ital data differently from 8-VSB or other single-carrier approaches. Frequency 
division multiplexing means that the data to be transmitted is distributed over 
many carriers (1705 or 6817 for DVB-T), as opposed to modulating a single 
carrier. Thus, the data rate on each COFDM carrier is much lower than that 
required of a single carrier. The COFDM carriers are orthogonal, or mutually 
perpendicular, and forward error correction (“coded”) is used. 

COFDM is a multiplexing technique rather than a modulation technique. 
One of any of the common modulation methods, such as QPSK, 16-QAM, or 
64-QAM, is used to modulate the COFDM carriers. 

See Coded Orthogonal Frequency Division Multiplexing. 

This is a test pattern used to check whether a video system is calibrated cor- 
rectly. A video system is calibrated correctly if the colors are the correct 
brightness, hue and saturation. This can be checked with a vectorscope. 
Color Burst 



Color Decoder 
Color 

Demodulator 

Color 

Difference 



Color 

Edging 

Color Encoder 

Color Key 
Color Killer 



Color 

Modulator 



An analog waveform of a specific frequency and amplitude that is positioned 
between the trailing edge of horizontal sync and the start of active video. The 
color burst tells the NTSC or PAL video decoder how to decode the color 
information contained in that line of active video. By looking at the color burst, 
the decoder can determine what’s blue, orange, or magenta. Essentially, the 
decoder figures out what the correct color is. 

See Chroma Demodulator. 

See Chroma Demodulator. 



All of the color spaces used in color video require three components. These 
might be R'G'B', YIQ, YCbCr, YPbPr, YUV, or Y(R-Y)(B-Y). In the 
Y(R-Y)(B-Y) color space, the R-Y and B-Y components are often referred to as 
color difference signals for obvious reasons. They are made by subtracting the 
luma (Y) from the red and blue components. I and Q and U and V are also 
color difference signals since they are scaled versions of R-Y and B-Y. All the 
Ys in each of the YIQ, YUV, and Y(R-Y)(B-Y) are basically the same, 
although they are slightly different between SDTV and HDTV. 
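
As a rough sketch of how the color difference signals are formed, the following computes Y, B-Y, and R-Y from gamma-corrected R'G'B' values. The luma weights (0.299, 0.587, 0.114) are the SDTV values from BT.601 and are an assumption brought in for the example; HDTV uses different weights. The function name `color_difference` is hypothetical.

```python
def color_difference(r, g, b):
    """Return (Y, B-Y, R-Y) for gamma-corrected R'G'B' values in the range 0..1.

    The luma weights are the SDTV (BT.601) values; HDTV uses different weights.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return y, b - y, r - y


# White carries no color, so both color difference signals are zero.
y_white, by_white, ry_white = color_difference(1.0, 1.0, 1.0)

# Pure blue has a small luma and a large positive B-Y.
y_blue, by_blue, ry_blue = color_difference(0.0, 0.0, 1.0)
```

Scaling B-Y and R-Y by fixed constants gives U and V; a further rotation gives I and Q.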

Extraneous colors that appear along the edges of objects, but don’t have a 
color relationship to those areas. 

The color encoder does the exact opposite of the color decoder. It takes two 
color difference signals, such as I and Q or U and V, and combines them into a 
chroma signal. 

This is essentially the same thing as chroma key. 

A color killer is a circuit that shuts off the color decoding if the incoming ana- 
log video does not contain color information. How does this work? The color 
killer looks for the color burst and if it can’t find it, it shuts off the color decod- 
ing. For example, let’s say that a color TV is going to receive material 
recorded in black and white. Since the black and white signal does not contain 
a color burst, the color decoding is shut off. Why is a color killer used? Well, in 
the old days, the color decoder would still generate a tiny little bit of color if a 
black and white transmission was received, due to small errors in the color 
decoder, causing a black and white program to have faint color spots through- 
out the picture. 

See Color Encoder. 
Color Purity 



Color Space 



Color 

Subcarrier 



This term is used to describe how close a color is to the theoretical. For 
example, in the YUV color space, color purity is specified as a percentage of 
saturation and ±θ, where θ is an angle in degrees, and both quantities are 
referenced to the color of interest. The smaller the numbers, the closer the 
actual color is to the color that it’s really supposed to be. For a studio-grade 
device, the saturation is ±2% and the hue is ±2 degrees. On a vectorscope, if 
you’re in that range, you’re studio quality. 

A color space is a mathematical representation for a color. No matter what 
color space is used (R'G'B', YIQ, YUV, etc.), orange is still orange. What 
changes is how you represent orange. For example, the R'G'B' color space is 
based on a Cartesian coordinate system and the HSI color space is based on a 
polar coordinate system. 

The color subcarrier is a signal used to control the color NTSC/PAL encoder 
or color decoder. For (M) NTSC, the frequency of the color subcarrier is 
about 3.58 MHz and for (B, D, G, H, I) PAL it’s about 4.43 MHz. In the color 
encoder, a portion of the color subcarrier is used to create the color burst, 
while in the color decoder, the color burst is used to reconstruct a color sub- 
carrier. 



Color 

Temperature 



Comb Filter 



Common 

Image 

Format 



Color temperature is measured in degrees Kelvin. If a TV has a color tempera- 
ture of 8000 degrees Kelvin, that means the whites have the same shade as a 
piece of pure carbon heated to that temperature. Low color temperatures have 
a shift towards red; high color temperatures have a shift towards blue. 

The standard for video is 6500 degrees Kelvin. Thus, professional TV 
monitors use a 6500-degree color temperature. However, most consumer TVs 
have a color temperature of 8000 degrees Kelvin or higher, resulting in a blu- 
ish cast. By adjusting the color temperature of the TV, more accurate colors 
are produced, at the expense of picture brightness. 

This is another method of performing NTSC and PAL Y/C separation. A comb 
filter is used in place of a chroma bandpass or chroma trap. The comb filter 
provides better video quality since it does a better job of separating the luma 
from chroma. It reduces the amount of creepy-crawlies or zipper artifacts. It’s 
called a comb filter because the frequency response looks like a comb. The 
important thing to remember is that the comb filter is a better method for Y/C 
separation than chroma bandpass or chroma trap. 
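
A two-line comb for NTSC can be sketched as follows. It relies on the fact that the NTSC color subcarrier phase inverts from one scan line to the next, so adding adjacent lines cancels the chroma and subtracting them cancels the luma. The sketch assumes the picture content is identical on both lines, which is exactly the assumption that breaks down on real images with vertical detail. The function name `comb_separate` and the sample values are hypothetical.

```python
def comb_separate(previous_line, current_line):
    """Split two adjacent composite lines into luma and chroma estimates."""
    luma = [(c + p) / 2 for p, c in zip(previous_line, current_line)]
    chroma = [(c - p) / 2 for p, c in zip(previous_line, current_line)]
    return luma, chroma


luma_in = [0.5, 0.5, 0.5, 0.5]
chroma_in = [0.1, -0.1, 0.1, -0.1]

# The same picture content on both lines, with the chroma inverted on one of them.
previous_line = [y - c for y, c in zip(luma_in, chroma_in)]
current_line = [y + c for y, c in zip(luma_in, chroma_in)]

luma_out, chroma_out = comb_separate(previous_line, current_line)
```

PAL needs a different arrangement because its subcarrier phase relationship between lines is not a simple inversion.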

See CIF. 
Common 

Interface 

Format 

Component 

Video 

Composite 

Video 



Compression 

Ratio 



See CIF. 



Video using three separate color components, such as YCbCr (digital), YPbPr 
(analog) or R'G'B' (digital or analog). 

A single analog video signal that contains brightness, color and timing infor- 
mation. If a video system is to receive video correctly, it must have several 
pieces of the puzzle in place. It must have the picture that is to be displayed on 
the screen, and it must be displayed with the correct colors. This piece is 
called the active video. The video system also needs information that tells it 
where to put each pixel. This is called sync. The display needs to know when 
to shut off the electron beam so the viewer can’t see the spot retrace across 
the CRT display. This piece of the video puzzle is called blanking. Now, each 
piece could be sent in parallel over three separate connections, and it would 
still be called video and would still look good on the screen. This is a waste, 
though, because all three pieces can be combined together so that only one 
connection is needed. Composite video is a video stream that combines all of 
the pieces required for displaying an image into one signal, thus requiring 
only one connection. NTSC and PAL are examples of composite video. Both 
are made up of active video, horizontal sync, horizontal blanking, vertical 
sync, vertical blanking and color burst. R'G'B' is not an example of composite 
video, even though each red, green and blue signal may contain sync and 
blanking information, because all three signals are required to display the pic- 
ture with the right colors. 

Compression ratio is a number used to tell how much information is squeezed 
out of an image when it has been compressed. For example, suppose we start 
with a 1 MB image and compress it down to 128 kB. The compression ratio 
would be: 

1,048,576/131,072 = 8 

This represents a compression ratio of 8:1; 1/8 of the original amount of 
storage is now required. For a given compression technique — MPEG, for 
example — the higher the compression ratio, the worse the image looks. This 
has nothing to do with which compression method is better, for example, 
JPEG vs. MPEG. A video stream that is compressed using MPEG at 100:1 may 
look better than the same video stream compressed to 100:1 using JPEG. 
Conditional 

Access 



Constant Bit 
Rate 

Contouring 



Contrast 



Creepy- 

Crawlies 



Cross Color 



This is a technology by which service providers enable subscribers to decode 
and view content. It consists of key decryption (using a key obtained from 
changing coded keys periodically sent with the content) and descrambling. 
The decryption may be proprietary (such as Canal+, DigiCipher, Irdeto 
Access, Nagravision, NDS, Viaccess, etc.) or standardized, such as the DVB 
common scrambling algorithm and OpenCable™. Conditional access may be 
thought of as a simple form of digital rights management. 

Two common DVB conditional access (CA) techniques are SimulCrypt 
and MultiCrypt. With SimulCrypt, a single transport stream can contain sev- 
eral CA systems. This enables receivers with different CA systems to receive 
and correctly decode the same video and audio streams. With MultiCrypt, a 
receiver permits the user to manually switch between CA systems. Thus, 
when the viewer is presented with a CA system which is not installed in the 
receiver, he or she simply switches CA cards. 

Constant bit-rate (CBR) means that a bitstream (compressed or uncom- 
pressed) has the same number of bits each second. 

This is an image artifact caused by not having enough bits to represent the 
image. The reason the effect is called “contouring” is because the image 
develops vertical bands of brightness. 
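
The effect is easy to reproduce by throwing away bits. The sketch below takes a smooth 8-bit ramp (256 brightness levels) and keeps only the top 3 bits of each sample; the function name `requantize` is hypothetical and chosen for the example.

```python
def requantize(sample, bits):
    """Keep only the top `bits` bits of an 8-bit sample."""
    shift = 8 - bits
    return (sample >> shift) << shift


ramp = list(range(256))                      # smooth horizontal gradient
coarse = [requantize(s, 3) for s in ramp]    # same gradient with 3 bits per sample
distinct_levels = len(set(coarse))
```

With only eight levels left, the gradient is drawn as eight flat steps, which show up as visible bands on the display.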

A video term referring to how far the whitest whites are from the blackest 
blacks in a video waveform. If the peak white is far away from the peak black, 
the image is said to have high contrast. With high contrast, the image is very 
stark and very contrasty, like a black-and-white tile floor. If the two are very 
close to each other, the image is said to have poor, or low, contrast. With poor 
contrast, an image may be referred to as being washed out — you can’t tell the 
difference between white and black, and the image looks gray. 

Yes, this is a real video term! Creepy-crawlies refers to a specific image artifact 
that is a result of the NTSC system. When the nightly news is on, and a little 
box containing a picture appears over the anchorperson’s shoulder, or when 
some computer-generated text shows up on top of the video clip being shown, 
get up close to the TV and check it out. Along the edges of the box, or along 
the edges of the text, you’ll notice some jaggies rolling up (or down) the pic- 
ture. That’s the creepy-crawlies. Some people refer to this as zipper because it 
looks like one. 

This occurs when the NTSC or PAL video decoder incorrectly interprets high- 
frequency luma information (brightness) to be chroma information (color), 
resulting in color being displayed where it shouldn’t. 
This occurs when the NTSC or PAL video decoder incorrectly interprets 
chroma information (color) to be high-frequency luma information (bright- 
ness). 

A condition when one signal erroneously modulates another signal. 



Interference from one signal that is detected on another. 

Abbreviation for Composite Video Baseband Signal or Composite Video, 
Blanking and Synchronization. 

Abbreviation for Digital Audio Visual Council. Its goal was to create an indus- 
try standard for the end-to-end interoperability of broadcast and interactive 
digital audio visual information, and of multimedia communication. The 
specification is now ISO/IEC 16500 (normative part) and TR 16501 (informative 
part). 

Abbreviation for decibels, a standard unit for expressing relative power, volt- 
age, or current. 

Measure of power in communications. 0 dBm = 1 mW, with a logarithmic rela- 
tionship as the values increase or decrease. In a 50-ohm system, 0 dBm = 
0.223 volts. 
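
The 0.223-volt figure can be checked with a short calculation. The sketch assumes the voltage is an RMS value across a resistive load (P = V²/R); the function names are hypothetical.

```python
import math


def dbm_to_milliwatts(dbm):
    """Convert a level in dBm to power in milliwatts."""
    return 10 ** (dbm / 10)


def dbm_to_volts_rms(dbm, impedance_ohms=50.0):
    """Convert a level in dBm to RMS volts across the given impedance."""
    watts = dbm_to_milliwatts(dbm) / 1000
    return math.sqrt(watts * impedance_ohms)


volts_at_0_dbm = dbm_to_volts_rms(0)       # about 0.2236 V in a 50-ohm system
milliwatts_at_30_dbm = dbm_to_milliwatts(30)
```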

Decibels referenced to 1 watt. 

DC restoration is what you have to do to an analog video signal after it has 
been AC-coupled and has to be digitized. Since the video waveform has been 
AC-coupled, we no longer know absolutely where it is. For example, is the 
bottom of the sync tip at -5 V or at 1 V? In fact, not only don’t we know where it is, it 
also changes over time, since the average voltage level of the active video 
changes over time. Since the ADC requires a known input level and range to 
work properly, the video signal needs to be referenced to a known DC level. 
DC restoration essentially adds a known DC level to an AC-coupled signal. In 
decoding video, the DC level used for DC restoration is usually such that 
when the sync tip is digitized, it will generate the number 0. 

This is an abbreviation for Downloadable Conditional Access System. 

This is an abbreviation for Discrete Cosine Transform, used in many video 
compression algorithms. 
One-tenth of a Bel, used to define the ratio of two powers, voltages, or 
currents, in terms of gains or losses. It is 10x the log of the power ratio and 20x 
the log of the voltage or current ratio. 
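
As a small worked example of the two forms, the following uses base-10 logarithms; the function names are hypothetical.

```python
import math


def power_ratio_db(p_out, p_in):
    """Gain (positive) or loss (negative) in dB for a ratio of two powers."""
    return 10 * math.log10(p_out / p_in)


def voltage_ratio_db(v_out, v_in):
    """Gain or loss in dB for a ratio of two voltages (or two currents)."""
    return 20 * math.log10(v_out / v_in)


double_power = power_ratio_db(2, 1)       # doubling the power is about +3 dB
ten_times_volts = voltage_ratio_db(10, 1) # ten times the voltage is +20 dB
half_volts = voltage_ratio_db(0.5, 1)     # half the voltage is about -6 dB
```

The factor of 20 for voltage and current follows from power being proportional to the square of either one, assuming the same impedance on both sides.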

When an analog video signal is digitized so that 100 samples are produced, but 
only every other one is stored or used, the signal is decimated by a factor of 
2:1. Applied both horizontally and vertically, the image is now 1/4 of its 
original size, since 3/4 of the data is missing. If only one out of five samples 
were used, then the image would be decimated by a factor of 5:1, and the image 
would be 1/25 its original size. Decimation, then, is a quick and easy method 
for image scaling. 

Decimation can be performed in several ways. One way is the method just 
described, where data is literally thrown away. Even though this technique is 
easy to implement and cheap, it introduces aliasing artifacts. Another method 
is to use a decimation filter, which reduces the aliasing artifacts, but is more 
costly to implement. 
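
The difference between the two approaches shows up clearly on fine detail. The sketch below decimates one line of samples by 2:1, first by dropping samples and then by averaging each pair before keeping one value. The averaging is a very crude stand-in for a real decimation filter and is an assumption made for the example, as are the function names.

```python
def decimate_drop(samples, factor):
    """Keep every `factor`-th sample and throw the rest away."""
    return samples[::factor]


def decimate_filtered(samples, factor):
    """Average each group of `factor` samples (a crude lowpass), then keep one value per group."""
    return [
        sum(samples[i:i + factor]) / factor
        for i in range(0, len(samples) - factor + 1, factor)
    ]


# Finest possible detail: alternating black and white samples.
alternating = [0, 100] * 4

dropped = decimate_drop(alternating, 2)        # every white sample is lost
filtered = decimate_filtered(alternating, 2)   # detail is averaged to mid-gray
```

Dropping samples turns the pattern into solid black, which is an aliasing artifact; filtering first gives the mid-gray a viewer would expect from detail too fine to resolve.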

A decimation filter is a lowpass filter designed to provide decimation without 
the aliasing artifacts associated with simply throwing data away. 

Also referred to as post-emphasis and post-equalization. De-emphasis per- 
forms a frequency-response characteristic that is complementary to that intro- 
duced by pre-emphasis. 

A circuit used to restore a frequency response to its original form. 



When 10 or 12 bits per color component are used to represent digital YCbCr 
or R'G'B' video data. 

The process of recovering an original signal from a modulated carrier. 

In NTSC and PAL video, demodulation is the technique used to recover the 
color difference signals. See the definitions for Chroma Demodulator and 
Color Decoder; these are two other names for the demodulator used in 
NTSC/PAL video applications. Demodulation is also used after DTV tuners to 
convert the transmitted DTV signal to a compressed bitstream. 

Differential gain is how much the color saturation changes when the luma 
level changes (it isn’t supposed to). For a video system, the better the 
differential gain (that is, the smaller the number specified), the better the system is 
at figuring out the correct color. 
Differential phase is how much the hue changes when the luma level changes 
(it isn’t supposed to). For a video system, the better the differential phase — 
that is, the smaller the number specified — the better the system is at figuring 
out the correct color. 

Digital video using three separate color components, such as YCbCr or 
R'G'B'. 



Digital video that is essentially the digitized waveform of NTSC or PAL video 
signals, with specific digital values assigned to the sync, blank and white lev- 
els. 

Digital Rights Management (DRM) is a generic term for a number of capabili- 
ties that allow a content producer or distributor to determine under what con- 
ditions their product can be acquired, stored, viewed, copied, loaned, and so 
on. 

An encryption method (also known as “5C” and “DTCP”) developed by Sony, 
Hitachi, Intel, Matsushita, and Toshiba for IP, USB, and IEEE 1394 interfaces. 



See DVD-Video and DVD-Audio. 



A DCT is just another way to represent an image. Instead of looking at it in the 
spatial domain (which, by the way, is how we normally do it), it is viewed in the 
frequency domain. It’s analogous to color spaces, where the color is still the 
color but is represented differently. Same thing applies here — the image is 
still the image, but it is represented in a different way. 

Why do JPEG, MPEG, H.261 and H.263 base part of their compression 
schemes on the DCT? Because it is more efficient to represent an image that 
way. In the same way that the YCbCr color space is more efficient than RGB in 
representing an image, the DCT is even more efficient at image representa- 
tion. 
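
A one-dimensional sketch hints at why the DCT representation is efficient. The codecs named above apply a two-dimensional, normalized DCT to blocks of pixels; the unnormalized 1-D DCT-II below is a simplification chosen for the example, and the function name `dct_1d` is hypothetical.

```python
import math


def dct_1d(samples):
    """Unnormalized DCT-II of a list of samples."""
    n = len(samples)
    return [
        sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(samples))
        for k in range(n)
    ]


flat = [100] * 8               # eight pixels of uniform brightness
coefficients = dct_1d(flat)    # all of the energy lands in the first coefficient
```

Eight identical pixels become one non-zero number followed by seven values that are essentially zero, and those near-zero values cost very little to store. Smooth areas of real images behave in a similar way.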

A discrete time oscillator is a digital version of the voltage-controlled oscilla- 
tor. 
Technique developed by Dotcast for data broadcasting within the NTSC video 
signal. It supports up to 4.5 Mbps per analog TV channel. 

Audio compression techniques developed by Dolby®. Both are multi-channel 
surround sound formats used in DVD, HD DVD, Blu-ray and DTV. 



The distance between screen pixels measured in millimeters. The smaller the 
number, the better the horizontal resolution. 

As the name implies, you are using two buffers — for video, this means two 
frame buffers. While buffer 1 is being read, buffer 2 is being written to. When 
finished, buffer 2 is read out while buffer 1 is being written to. 
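
The swap can be sketched with two lists standing in for the frame buffers. The class and attribute names are hypothetical, and a real system would swap during vertical blanking so the display never shows a half-written frame.

```python
class DoubleBuffer:
    """Two buffers: one is read for display while the other is written."""

    def __init__(self, size):
        self._buffers = [[0] * size, [0] * size]
        self._read_index = 0

    @property
    def read_buffer(self):
        return self._buffers[self._read_index]

    @property
    def write_buffer(self):
        return self._buffers[1 - self._read_index]

    def swap(self):
        """Exchange the roles of the two buffers."""
        self._read_index = 1 - self._read_index


frames = DoubleBuffer(4)
frames.write_buffer[:] = [1, 2, 3, 4]     # draw the next frame off-screen
before_swap = list(frames.read_buffer)    # display still shows the old frame
frames.swap()
after_swap = list(frames.read_buffer)     # display now shows the new frame
```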

A circuit used to change a high-frequency signal to a lower frequency. 

The frequency satellites use to transmit data to earth stations. 

See Digital Rights Management. 

Abbreviation for Digital Multimedia Broadcasting, developed in Korea. It is 
broadcast via satellite (DMB-S) and terrestrial (DMB-T), and uses MPEG-4.10 
(H.264) for the video and MPEG-4.3 BSAC or HE-AACv2 for the audio. The 
audio and video are encapsulated in an MPEG-2 transport stream. DMB is an 
ETSI standard (TS 102 427 and TS 102 428). 

This method is identical to the sync suppression technique for scrambling 
analog TV channels, except there is no suppression of the horizontal blanking 
intervals. Sync pulse suppression only takes place during the vertical blanking 
interval. The descrambling pulses still go out for the horizontal blanking 
intervals (to fool unauthorized descrambling devices). If a descrambling device is 
triggering on descrambling pulses only, and does not know that the scrambler 
is using the drop field scrambling technique, it will try to reinsert the 
horizontal intervals (which were never suppressed). This is known as double 
reinsertion, which causes compression of the active video signal. An unauthorized 
descrambling device creates a washed-out picture and loss of neutral sync 
during drop field scrambling. 

Abbreviation for Digital Transmission Content Protection. 
DTS® stands for Digital Theater Systems. It is a multi-channel surround sound 
format, similar to Dolby® Digital. For DVDs that use DTS® audio, the 
DVD-Video specification still requires that PCM or Dolby® Digital audio be 
present. In this situation, only two channels of Dolby® Digital audio may be 
present (due to bandwidth limitations). 

Abbreviation for Digital Television, including SDTV, EDTV and HDTV. 

Abbreviation for Digital Video, the standard used for digital camcorders that 
record on tape. It is defined by the BT.1618, BT.1620, IEC 61834, SMPTE 
314M, and 370M specifications. 

Abbreviation for digital video broadcast, a method of transmitting digital audio 
and video. There are several variations: DVB-T for terrestrial broadcasting, 
DVB-S for satellite broadcasting, DVB-C for cable broadcasting, DVB-H and 
DVB-SH for handheld devices and DVB-IP for IPTV applications. 

DVDs that contain linear PCM audio data in any combination of 44.1, 48.0, 
88.2, 96.0, 176.4, or 192 kHz sample rates, 16, 20, or 24 bits per sample, and 1 
to 6 channels, subject to a maximum bit-rate of 9.6 Mbps. With a 176.4 or 192 
kHz sample rate, only two channels are allowed. 

Meridian Lossless Packing (MLP) is a lossless compression method that 
has an approximate 2:1 compression ratio. The use of MLP is optional, but the 
decoding capability is mandatory on all DVD-Audio players. 

Dolby® Digital compressed audio is required for any video portion of a 
DVD-Audio disc. 

DVDs that contain about 2 hours of digital audio, video and data. The video is 
compressed and stored using MPEG-2 MP@ML. A variable bit-rate is used, 
with an average of about 4 Mbps (video only), and a peak of 10 Mbps (audio 
and video). The audio is either linear PCM or Dolby® Digital compressed 
audio. DTS® compressed audio may also be used as an option. 

Linear PCM audio can be sampled at 48 or 96 kHz, 16, 20, or 24 bits per 
sample, and 1 to 8 channels. The maximum bit-rate is 6.144 Mbps, which lim- 
its sample rates and bit sizes in some cases. 
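
To see how the 6.144 Mbps ceiling limits the combinations, the raw linear PCM bit-rate can be computed as sample rate × bits per sample × channels. This is a simple sketch that ignores any packing overhead; the function and constant names are hypothetical.

```python
MAX_PCM_BIT_RATE = 6_144_000  # bits per second, from the limit quoted above


def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Raw linear PCM bit-rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def fits_dvd_video(sample_rate_hz, bits_per_sample, channels):
    """True if the combination stays within the maximum PCM bit-rate."""
    return pcm_bit_rate(sample_rate_hz, bits_per_sample, channels) <= MAX_PCM_BIT_RATE


eight_channel_rate = pcm_bit_rate(48_000, 16, 8)   # exactly at the limit
stereo_high_res = fits_dvd_video(96_000, 24, 2)    # fits
eight_high_res = fits_dvd_video(96_000, 24, 8)     # far over the limit
```

Eight channels are therefore only possible at 48 kHz and 16 bits, while 96 kHz at 24 bits leaves room for just two channels.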

For Dolby® Digital audio, the bit-rate is 64 to 448 kbps, with 384 kbps 
being the normal rate for 5.1 channels and 192 kbps being the normal rate for 
stereo. The channel combinations are (front/surround): 1/0, 1+1/0 (dual 
mono), 2/0, 3/0, 2/1, 3/1, 2/2 and 3/2. The LFE channel (0.1) is optional with 
all 8 combinations. 

For DTS® audio, the bit-rate is 64 to 1536 kbps. The channel combinations 
are (front/surround): 1/0, 2/0, 3/0, 2/1, 2/2, 3/2. The LFE channel (0.1) is 
optional with all 6 combinations. 
DVI Abbreviation for Digital Visual Interface. This is a digital video interface to a 
display, designed to replace the analog YPbPr or R'G'B' interface. For analog 
displays, the D/A conversion resides in the display. The CEA-861 standard 
specifies how to include data such as aspect ratio and format information. The 
VESA E-EDID and DI-EXT standards document data structures and 
mechanisms to communicate data across DVI. 

DVI-D is a digital-only connector; a DVI-I connector handles both analog 
and digital. DVI-A is available as a plug (male) connector only and mates to 
the analog-only pins of a DVI-I connector. DVI-A is only used in adapter cables, 
where there is the need to convert to or from a traditional analog VGA signal. 

DVITC Abbreviation for Digital Vertical Interval Timecode. 

Dynamic Range The weakest to the strongest signal a circuit will accept as input or generate as 
an output. 

E-AC-3 Another name for Dolby® Digital Plus compressed audio. 

E-VSB A second, more robust ATSC channel. 

EDTV See Enhanced Definition Television. 

EIA Abbreviation for Electronics Industries Alliance. 

EIA-516 United States teletext standard, also called NABTS. 

EIA-744 Defines the V-chip operation. This standard added content advisory filtering 
capabilities to the CEA-608 closed captioning standard. It is now included in 
the latest CEA-608 standard, and has been withdrawn. 

EIA-761 Specifies how to convert QAM to 8-VSB, with support for OSD (on screen 
displays). 

EIA-762 Specifies how to convert QAM to 8-VSB, with no support for OSD (on screen 
displays). 

EIA-766 United States HDTV content advisory standard. 

EIA-770 This specification consists of three parts. EIA-770.1 and EIA-770.2 define the 
analog YPbPr video interface for 480i video systems. EIA-770.3 defines the 
analog YPbPr video interface for HDTV systems. CEA-805 defines how to 
transfer VBI data over these YPbPr video interfaces. 
EIA-775 defines a specification for a baseband digital interface to a DTV using 
IEEE 1394 and provides a level of functionality that is similar to the analog sys- 
tem. It is designed to enable interoperability between a DTV and various types 
of consumer digital audio/video sources, including set-top boxes and DVRs or 
VCRs. 

EIA-775.1 adds mechanisms to allow a source of MPEG services to utilize 
the MPEG decoding and display capabilities in a DTV. 

EIA-775.2 adds information on how a digital storage device, such as a D- 
VHS or hard disk digital recorder, may be used by the DTV or by another 
source device such as a cable set-top box to record or time-shift digital televi- 
sion signals. This standard supports the use of such storage devices by defin- 
ing Service Selection Information (SSI), methods for managing 
discontinuities that occur during recording and playback and rules for man- 
agement of partial transport streams. 

EIA-849 specifies profiles for various applications of the EIA-775 standard, 
including digital streams compliant with ATSC terrestrial broadcast, 
direct-broadcast satellite (DBS), OpenCable™ and standard-definition Digital Video 
(DV) camcorders. 

This EIA-J recommendation specifies another widescreen signaling (WSS) 
standard for 480i video signals. WSS may be present on lines 20 and 283. 

EDTV is content or a display capable of displaying a maximum of 576 progres- 
sive active scan lines. No aspect ratio is specified. 



These are two groups of pulses, one that occurs before the serrated vertical 
sync and another group that occurs after. These pulses happen at twice the 
normal horizontal scan rate. They exist to ensure correct 2:1 interlacing in 
early televisions. 

The ability to hide transmission errors that corrupt the content beyond the 
ability of the receiver to properly display it. Techniques for video include 
replacing the corrupt region with either earlier video data, interpolated video 
data from previous and next frames, or interpolated data from neighboring 
areas within the current frame. Decompressed video may also be processed 
using deblocking and mosquito filters to reduce artifacts. Techniques for 
audio include replacing the corrupt region with interpolated audio data. 
The ability to handle transmission errors without corrupting the content 
beyond the ability of the receiver to properly display it. MPEG-4 supports 
error resilience through the use of resynchronization markers, extended 
header code, data partitioning and reversible VLCs. 

This specification defines NICAM 728 digital audio for PAL. 



This specification defines information sent during the vertical blanking 
interval using PAL teletext (ETSI EN 300 706) to control VCRs in Europe (PDC). 

Defines the widescreen signaling (WSS) information for 576i video signals. 
For 576i video systems, WSS may be present on line 23. 

This is the DVB-S specification. 



This is the DVB-C specification. 



This is the DVB SI (service information) specification. 



This is the specification for the carriage of teletext data (ETSI EN 300 706) in 
DVB bitstreams. 

This is the enhanced PAL teletext specification. 

This specification defines data transmission using PAL teletext (ETSI EN 300 
706). 

This is the DVB subtitling specification. 



This is the DVB-T specification. 



This is the DVB data broadcasting specification. 



This is the specification for the carriage of vertical blanking information (VBI) 
data in DVB bitstreams. 

This is the DVB-H specification for handheld markets.

This is the DVB-S2 specification. 

Defines the PALplus standard, allowing the transmission of 16:9 programs 
over normal PAL transmission systems. 

Defines the ghost cancellation reference (GCR) signal for PAL. 



Fading is a method of switching from one video source to another. Next time 
you watch a TV program (or a movie), pay extra attention when the scene is
about to end and go on to another. The scene fades to black, then a fade from 
black to another scene occurs. Fading between scenes without going to black 
is called a dissolve. One way to do a fade is to use an Alpha Mixer. 
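
As a rough illustration of the alpha-mixer approach, the sketch below blends two sample values with a mix factor that ramps from 0 to 1. The names `alpha_mix` and `dissolve`, and the 8-bit sample values, are assumptions made for this example rather than part of any particular device.

```python
def alpha_mix(a, b, alpha):
    """Blend samples a and b; alpha=0.0 gives a, alpha=1.0 gives b."""
    return round((1.0 - alpha) * a + alpha * b)


def dissolve(a, b, steps):
    """Return the blended sample at each step of a dissolve from a to b."""
    return [alpha_mix(a, b, i / (steps - 1)) for i in range(steps)]


# Dissolve from a bright sample (200) to a darker one (40) in 5 steps.
ramp = dissolve(200, 40, 5)

# A fade to black is the same operation with black (0) as the second source.
fade_out = dissolve(200, 0, 5)
```

In a real mixer the same blend would be applied to every sample of every component, with alpha changing once per field or frame.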

An interlaced display is made using two fields, each one containing one-half of 
the scan lines needed to make up one frame of video. Each field is displayed in 
its entirety — therefore, the odd field is displayed, then the even, then the odd, 
and so on. Fields only exist for interlaced scanning systems. So for 480i video 
systems, which have 525 lines per frame, a field has 262.5 lines, and two fields 
make up a 525-line frame. 

Flicker occurs when the frame rate of the video is too low. It’s the same effect 
produced by a fluorescent light fixture. The two problems with flicker are that 
it’s distracting and tiring to the eyes. 

See Frequency Modulation. 

A frame of video is essentially one picture or still out of a video stream. By 
playing these individual frames fast enough, it looks like people are moving on 
the screen. It’s the same principle as flip cards, cartoons and movies. 

A frame buffer is a memory used to hold an image for display. How much 
memory are we talking about? Well, let’s assume a horizontal resolution of 640 
pixels and 480 scan lines, and we’ll use the R'G'B' color space. This works out
to be: 

640 x 480 x 3 = 921,600 bytes or 900 kB 



So, 900 kB are needed to store one frame of video at that resolution. 
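
The same arithmetic can be written as a small helper. This is only a sketch: the function name is made up for the example, and it assumes 8 bits (one byte) per color component and 1 kB = 1024 bytes, which is what the 900 kB figure above implies.

```python
def frame_buffer_bytes(width, height, bytes_per_pixel=3):
    """Memory needed for one frame, assuming one byte per color component."""
    return width * height * bytes_per_pixel


size = frame_buffer_bytes(640, 480)
size_kb = size / 1024  # assumes 1 kB = 1024 bytes
print(size, "bytes =", size_kb, "kB")
```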

The frame rate of a video source is how fast a new still image is available. 576i
and 480i displays originally used 25 and 30 frames per second, respectively. 
Refresh rates of 50, 60, 100, and 120 frames per second are now common. 

Frame rate conversion is the act of converting one frame rate to another. 



This is the area of the analog video waveform that sits between the start of 
horizontal blanking and the start of horizontal sync. 

FVD (Finalized Versatile Disc) is an increased-capacity red-laser DVD disc 
and player specification from Taiwan. It uses the WMV9 and WMA9 codecs. 
The 5.4GB/9.8GB FVD-1 disc supports up to 135 minutes of WMV9 720p con- 
tent. The 6GB/11GB FVD-2 disc will support up to 1080i content. 

The transfer characteristics of most cameras and displays are nonlinear. For a 
display, a small change in amplitude when the signal level is small produces a 
change in the display brightness level, but the same change in amplitude at a 
high level will not produce the same magnitude of brightness change. This 
nonlinearity is known as gamma. 

Before being displayed, linear RGB data must be processed (gamma cor- 
rected) to compensate for the nonlinearity of the display. 
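
A minimal sketch of the idea, assuming a simple power-law display with a gamma of 2.2. Both the exponent and the function names are assumptions for illustration; actual video standards define their own transfer functions, typically with a linear segment near black.

```python
def gamma_correct(v, gamma=2.2):
    """Map a linear value in [0, 1] to a gamma-corrected value in [0, 1]."""
    return v ** (1.0 / gamma)


def display_response(v_prime, gamma=2.2):
    """Model of a display that raises its input to the power gamma."""
    return v_prime ** gamma


linear = 0.5
corrected = gamma_correct(linear)       # brighter than 0.5 before display
shown = display_response(corrected)     # display nonlinearity undoes it
```

Applying the correction before the display means the two nonlinearities cancel, so the light output tracks the original linear value.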

See Ghost Cancellation Reference Signal. 

An analog video signal provides all of the information necessary for a video 
decoder to reconstruct the picture. This includes brightness, color and timing 
information. To properly decode the video signal, the video decoder must lock 
to all the timing information embedded within the video signal, including the 
color burst, horizontal sync and vertical sync. The decoder looks at the color 
burst of the video signal and reconstructs the original color subcarrier that 
was used by the encoder. This is needed to decode the color information prop- 
erly. It also generates a sample clock (done by looking at the sync information 
within the video signal), used to clock pixel data out of the decoder into a 
memory or another circuit for processing. The circuitry within the decoder 
that does all of this work is called the genlock circuit. Although it sounds sim- 
ple, the genlock circuit must be able to handle very bad video sources, such as 
the output of VCRs, cameras and toys. In reality, the genlock circuit is the 
most complex section of a video decoder. 

A reference signal on (M) NTSC scan lines 19 and 282 and (B, D, G, H, I) PAL
scan line 318 that allows the removal of ghosting from TVs. Filtering is 
employed to process the transmitted GCR signal and determine how to filter 
the entire video signal to remove the ghosting. ITU-R BT.1124 and ETSI ETS 
300 732 define the standard each country uses. ATSC A/49 also defines the 
standard for NTSC. 

The term gray scale has several meanings. In some cases, it means the luma
component of color video signals. In other cases, it means a black-and-white 
video signal. 

The ITU-T H.261 and H.263 video compression standards were developed to 
implement video conferencing over ISDN, LANs, regular phone lines, etc. 
H.261 supports video resolutions of 352 x 288 and 176 x 144 at up to 29.97 
frames per second. H.263 supports video resolutions of 1408 x 1152, 704 x 576, 
352 x 288, 176 x 144 and 128 x 96 at up to 29.97 frames per second. 

The “next-generation” video codec. Previously known as “H.26L,” “JVT,” and 
“AVC” (advanced video codec), it is now also an MPEG-4.10 standard. 

ITU-T H.264 offers bit-rates up to 50% less than the MPEG-4 advanced 
simple profile (ASP) video codec for the same video quality. It was designed to 
compete with the SMPTE 421M (VC-1) video codec in bit-rate and quality. 

Early name for the H.264 video codec. 

See CIF. 

High Data-Rate Serial Data Transport Interface, defined by SMPTE 348M. 

Abbreviation for High-Definition Multimedia Interface, a single-cable digital 
audio/video interface for consumer equipment. It is designed to replace DVI 
in a backwards-compatible fashion and supports CEA-861 and HDCP. 

Digital RGB or YCbCr data at rates up to 5 Gbps are supported (HDTV 
requires 2.2 Gbps). Up to 8 channels of 32-192 kHz digital audio are also sup- 
ported, along with AV.link (remote control) capability and a smaller 15 mm 19- 
pin connector. 

See High-Definition Television. 

Abbreviation for High-Definition DV, a high-definition digital camcorder for- 
mat. 

HDTV is capable of displaying at least 720 progressive or 1080 interlaced
active scan lines. It must be capable of displaying a 16:9 image using at least 
540 progressive or 810 interlaced active scan lines. 

A circuit that passes frequencies above a specific frequency (the cutoff
frequency). Frequencies below the cutoff frequency are reduced in amplitude to
eliminate them. 

During the horizontal blanking interval, the video signal is at the blank level 
so as not to display the electron beam when it sweeps back from the right to 
the left side of the CRT screen. 

See Resolution. 



This is how fast the scanning beam in a display sweeps from side to side. In
the 480i video system, each line takes 63.556 µs, for a rate of 15.734 kHz. That
means the scanning beam moves from side to side 15,734 times a second.
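
The figures can be checked with a little arithmetic. The sketch below assumes 525 lines per frame and a frame rate of 30000/1001 (about 29.97) frames per second for 480i color video; the variable names are just for the example.

```python
LINES_PER_FRAME = 525
FRAMES_PER_SECOND = 30000 / 1001   # about 29.97 (assumed 480i color rate)

line_rate_hz = LINES_PER_FRAME * FRAMES_PER_SECOND
line_period_us = 1e6 / line_rate_hz   # time for one scan line, microseconds

print(f"{line_rate_hz / 1000:.3f} kHz, {line_period_us:.3f} us per line")
```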

This is the portion of the video signal that tells the display where to place the 
image in the left-to-right dimension. The horizontal sync pulse tells the receiv- 
ing system where the beginning of the new scan line is. 

This is another name for black burst. 

Check out the Horizontal Sync definition. 

In technical terms, hue refers to the wavelength of the color. That means that 
hue is the term used for the base color — red, green, yellow, etc. Hue is com- 
pletely separate from the intensity or the saturation of the color. For example, 
a red hue could look brown at low saturation, bright red at a higher level of 
saturation, or pink at a high brightness level. All three “colors” have the same 
hue. 

Huffman coding is a method of data compression. It doesn’t matter what the 
data is — it could be image data, audio data, or whatever. It just so happens that 
Huffman coding is one of the techniques used in JPEG, MPEG, H.261 and 
H.263 to help with the compression. This is how it works. First, take a look at 
the data that needs to be compressed and create a table that lists how many 
times each piece of unique data occurs. Now assign a very small code word to 
the piece of data that occurs most frequently. The next largest code word is 
assigned to the piece of data that occurs next most frequently. This continues 
until all of the unique pieces of data are assigned unique code words of
varying lengths. The idea is that data that occurs most frequently is assigned a
small code word, and data that rarely occurs is assigned a long code word, 
resulting in space savings. 
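
One common way to build such a code is the classic tree-merging construction: repeatedly combine the two least frequent entries until a single tree remains. The sketch below is a minimal illustration of that construction, not the exact table-building procedure used by any particular JPEG or MPEG implementation; the function name and sample string are made up for the example.

```python
import heapq
from collections import Counter


def huffman_codes(data):
    """Return a dict mapping each symbol in data to its code bit string."""
    counts = Counter(data)
    if not counts:
        return {}
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries: (count, tie_breaker, {symbol: code_so_far}).
    heap = [(n, i, {sym: ""}) for i, (sym, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n0, _, group0 = heapq.heappop(heap)   # least frequent group
        n1, _, group1 = heapq.heappop(heap)   # next least frequent group
        merged = {s: "0" + c for s, c in group0.items()}
        merged.update({s: "1" + c for s, c in group1.items()})
        heapq.heappush(heap, (n0 + n1, tie, merged))
        tie += 1
    return heap[0][2]


text = "aaaaaaaabbbbccd"
codes = huffman_codes(text)
encoded = "".join(codes[ch] for ch in text)
```

Here "a" occurs most often and ends up with the shortest code word, while the rare "c" and "d" get the longest ones.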

HVD (High-definition Versatile Disc) is a red-laser DVD disc and player speci- 
fication from China. It supports up to 150 minutes of 720p MPEG-2 content. 
The player supports 1080i and 720p video outputs.

See Improved Definition Television. 

Defines the longitudinal (LTC) and vertical interval (VITC) timecode for 480i 
and 576i video systems. LTC requires an entire field time to transfer timecode 
information, using a separate track. VITC uses one scan line each field during 
the vertical blanking interval. Also see SMPTE 12M. 

Defines a serial digital audio interface for consumer (SPDIF) and professional 
applications. 

Defines the DV (originally the “Blue Book”) standard. Also see BT.1618 and 
SMPTE 314M. 

Defines the widescreen signaling (WSS) information for 480i video signals. 
WSS may be present on lines 20 and 283. 

Defines the methods for transferring data, audio, DV (IEC 61834) and MPEG- 
2 data over IEEE 1394. 

Defines the Super VideoCD standard. 

A high-speed daisy-chained serial interface. Digital audio, video and data can 
be transferred with either a guaranteed bandwidth or a guaranteed latency. It 
is hot-pluggable, and uses a small 6-pin or 4-pin connector, with the 6-pin con- 
nector providing power. 

Some colors that exist in the R'G'B' color space can’t be represented in the
NTSC and PAL video domain. For example, 100% saturated red in the R'G'B'
space (which is the red color on full strength and the blue and green colors 
turned off) can’t exist in the NTSC video signal, due to color bandwidth limita- 
tions. The NTSC encoder must be able to determine that an illegal color is 
being generated and stop that from occurring, since it may cause over-satura- 
tion and blooming. 

IDTV is different from HDTV. IDTV is a system that improves the display on
TVs by adding processing in the TV; standard NTSC or PAL signals are trans- 
mitted. 



This is the same thing as Brightness. 

An interlaced video system is one where two interleaved fields are used to 
generate one video frame. Therefore, the number of lines in a field is one-half 
of the number of lines in a frame. In 480i video systems, there are 262.5 lines 
per field (525 lines per frame), while there are 312.5 lines per field (625 lines 
per frame) in 576i video systems. Each field is drawn on the screen consecu- 
tively — first one field, then the other. 

Interpolation is a mathematical way of generating additional information. Let’s 
say that an image needs to be scaled up by a factor of two, from 100 samples to 
200 samples. The “missing” samples are generated by calculating (interpolat- 
ing) new samples between two existing samples. After all of the “missing” 
samples have been generated — presto! — 200 samples exist where only 100 
existed before, and the image is twice as big as it used to be. 
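
As a small illustration, here is a 2x upscaler that uses linear interpolation, which is only one of several possible methods. The function name is hypothetical, and repeating the last sample at the right-hand edge is an assumption made to keep the sketch short.

```python
def upscale_2x(samples):
    """Double the sample count by inserting the average of each neighbor pair."""
    out = []
    for i, s in enumerate(samples):
        # At the right edge there is no neighbor, so reuse the last sample.
        nxt = samples[i + 1] if i + 1 < len(samples) else s
        out.append(s)               # original sample
        out.append((s + nxt) / 2)   # interpolated "missing" sample
    return out


line = upscale_2x([0, 10, 20])
bigger = upscale_2x(list(range(100)))   # 100 samples in, 200 samples out
```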

An arbitrary unit used to describe the amplitude characteristics of a video sig- 
nal. White is defined to be 100 IRE and the blanking level is defined to be 0 
IRE. 

Abbreviation for Integrated Services Digital Broadcasting, the digital televi- 
sion (DTV) broadcast standard used in Japan. 

See BT.xxx. 

Short-term variations in the characteristics (such as frequency, amplitude, 
etc.) of a signal. 

JPEG stands for Joint Photographic Experts Group. However, what people 
usually mean when they use the term “JPEG” is the image compression stan- 
dard they developed. JPEG was developed to compress still images, such as 
photographs, a single video frame, something scanned into the computer, and 
so forth. You can run JPEG at any speed that the application requires. For a 
still picture database, the algorithm doesn’t have to be very fast. If you run 
JPEG fast enough, you can compress motion video — which means that JPEG 
would have to run at 50 or 60 fields per second. This is called motion JPEG or 
M-JPEG. You might want to do this if you were designing a video editing sys- 
tem. Now, M-JPEG running at 60 fields per second is not as efficient as 
MPEG-2 running at 60 fields per second because MPEG was designed to take
advantage of certain aspects of motion video. 

Abbreviation for Joint Video Team, a group of video experts that work on new 
video codecs. It is a collaborative effort between the International Telecommu- 
nication Union (ITU), the International Electrotechnical Commission (IEC), 
and the International Organization for Standardization (ISO). “JVT” was also
an early name for the H.264 video codec. 

Abbreviation for kilobits per second. 

Abbreviation for kilobytes per second. 

A design that ensures that there is always a constant number of samples per 
scan line, even if the timing of the line changes. 

A line store is a memory used to hold one scan line of video. If the horizontal 
resolution of the active display is 640 samples and R'G'B' is used as the color
space, the line store would have to be 640 locations long by 3 bytes wide. This 
amounts to one location for each sample and each color. Line stores are typi- 
cally used in filtering algorithms. For example, a comb filter is made up of one 
or more line stores. 

Linearity is a basic measurement of how well an ADC or DAC is performing. 
Linearity is typically measured by making the ADC or DAC attempt to gener- 
ate a linearly increasing signal. The actual output is compared to the ideal of 
the output. The difference is a measure of the linearity. The smaller the num- 
ber, the better. Linearity is typically specified as a range or percentage of LSBs 
(Least Significant Bits).

When a PLL is accurately producing timing that is precisely lined up with the 
timing of the incoming video source, the PLL is said to be “locked.” When a 
PLL is locked, the PLL is stable and there is minimum jitter in the generated 
sample clock. 

Timecode information that is stored on a separate track from the video, requir- 
ing an entire field time to store or read it. 

Lossless is a term used with compression. Lossless compression is when the 
decompressed data is exactly the same as the original data. It’s lossless 
because you haven’t lost anything. 

Lossy compression is the exact opposite of lossless. The regenerated data is
different from the original data. The differences may or may not be noticeable, 
but if the two images are not identical, the compression was lossy. 

A circuit that passes frequencies below a specific frequency (the cutoff
frequency). Frequencies above the cutoff frequency are reduced in amplitude to
eliminate them. 

See Longitudinal Timecode. 

As mentioned in the definition of chroma, video systems use a signal that has 
two pieces: the black and white part and the color part. The black and white 
part is the luma. It was the luma component that allowed color TV broadcasts 
to be received by black and white TVs and still remain viewable. 

In video, the terms luminance and luma are commonly (and incorrectly) inter- 
changed. See the definition of Luma. 

Abbreviation for megabits per second. 

Abbreviation for megabytes per second. 

A technique of recording SECAM video. Instead of dividing the FM color sub- 
carrier by four and then multiplying back up on playback, MESECAM uses 
the same heterodyne conversion as PAL. 

See Multimedia Home Platform. 

See Motion JPEG. 

A modulator is basically a circuit that combines two different signals in such a 
way that they can be pulled apart later. What does this have to do with video? 
Let’s take the NTSC system as an example, although the example applies 
equally well to PAL. The NTSC system may use the YIQ or YUV color space, 
with the I and Q or U and V signals containing all of the color information for 
the picture. Two 3.58 MHz color subcarriers (90 degrees out of phase) are 
modulated by the I and Q or U and V components and added together to cre- 
ate the chroma part of the NTSC video. 
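
The "pulled apart later" property can be shown with a small numeric sketch. It assumes constant U and V values, a sample clock at four times the subcarrier frequency, and plain averaging over whole subcarrier cycles in place of the low-pass filter a real decoder would use. All names here are hypothetical.

```python
import math

SAMPLES_PER_CYCLE = 4  # assumed: sample clock at 4x the color subcarrier


def carrier_phase(n):
    """Subcarrier phase in radians at sample index n."""
    return 2 * math.pi * n / SAMPLES_PER_CYCLE


def modulate(u, v, n_samples):
    """Chroma samples for constant U and V: u*sin(phase) + v*cos(phase)."""
    return [u * math.sin(carrier_phase(n)) + v * math.cos(carrier_phase(n))
            for n in range(n_samples)]


def demodulate(chroma):
    """Recover U and V by multiplying by each carrier and averaging."""
    count = len(chroma)
    u = sum(2 * c * math.sin(carrier_phase(n))
            for n, c in enumerate(chroma)) / count
    v = sum(2 * c * math.cos(carrier_phase(n))
            for n, c in enumerate(chroma)) / count
    return u, v


chroma = modulate(0.3, -0.2, 8)   # two full subcarrier cycles
u, v = demodulate(chroma)
```

Because the two carriers are 90 degrees apart, multiplying by one of them and averaging cancels the contribution of the other, which is why both color components can share a single chroma signal.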

This is a type of image artifact. A moire effect occurs when a pattern is created
on the display where there really shouldn’t be one. A moire pattern is typically 
generated when two different frequencies beat together to create a new, 
unwanted frequency. 

A monochrome signal is a video source having only one component. Although 
usually meant to be the luma (or black-and-white) video signal, the red video 
signal is also monochrome because it only has one component. 

This is a term that is used to describe ADCs and DACs. An ADC or DAC is 
said to be monotonic if for every increase in input signal, the output increases. 
Any ADC or DAC that is nonmonotonic — meaning that the output decreases 
for an increase in input — is bad! Nobody wants a nonmonotonic ADC or DAC. 
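
A quick check for this property might look like the sketch below. It assumes the converter's output has been recorded for each input step in increasing order, and it treats an output that stays the same between steps as acceptable; only a decrease counts as a failure. The function name is made up for the example.

```python
def is_monotonic(outputs):
    """True if the output never decreases as the input steps upward."""
    return all(b >= a for a, b in zip(outputs, outputs[1:]))


good_dac = [0, 1, 2, 3, 4]
bad_dac = [0, 1, 3, 2, 4]   # output drops between the third and fourth steps
```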

Motion estimation is trying to figure out where an object has moved to from 
one video frame to the other. Why would you want to do that? Well, let’s take 
an example of a video source showing a ball flying through the air. The back- 
ground is a solid color that is different from the color of the ball. In one video 
frame the ball is at one location and in the next video frame the ball has moved 
up and to the right by some amount. Now let’s assume that the video camera 
has just sent the first video frame of the series. Now, instead of sending the 
second frame, wouldn’t it be more efficient to send only the position of the 
ball? Nothing else moves, so only two little numbers would have to be sent. 
This is the essence of motion estimation. By the way, motion estimation is an 
integral part of MPEG, H.261 and H.263. 
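
The ball example can be sketched with exhaustive block matching, one simple way to estimate motion. The block size, search range, sum-of-absolute-differences cost, and all function names below are assumptions for illustration; real encoders use far more elaborate searches.

```python
def sad(frame_a, frame_b, ax, ay, bx, by, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(abs(frame_a[ay + j][ax + i] - frame_b[by + j][bx + i])
               for j in range(size) for i in range(size))


def find_motion(prev, curr, x, y, size, search):
    """Find (dx, dy) so the block at (x, y) in prev best matches
    the block at (x + dx, y + dy) in curr."""
    height, width = len(curr), len(curr[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            nx, ny = x + dx, y + dy
            if 0 <= nx <= width - size and 0 <= ny <= height - size:
                cost = sad(prev, curr, x, y, nx, ny, size)
                if best is None or cost < best[0]:
                    best = (cost, dx, dy)
    return best[1], best[2]


def make_frame(ball_x, ball_y, width=16, height=16, size=2):
    """Solid background (0) with a small square 'ball' (255)."""
    frame = [[0] * width for _ in range(height)]
    for j in range(size):
        for i in range(size):
            frame[ball_y + j][ball_x + i] = 255
    return frame


prev_frame = make_frame(4, 8)
curr_frame = make_frame(7, 6)   # ball moved right 3 and up 2 (y grows downward)
motion = find_motion(prev_frame, curr_frame, 4, 8, size=2, search=4)
```

The result is the pair of "two little numbers" from the text: a horizontal and a vertical offset for the ball.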

JPEG compression or decompression that is applied real-time to video. Each 
field or frame of video is individually processed. 

MPEG stands for Moving Picture Experts Group. This is a joint ISO/IEC
(International Organization for Standardization and International
Electrotechnical Commission) body that is developing various compression
algorithms. MPEG differs from JPEG in that MPEG takes advantage of the
redundancy on a frame-to-frame basis of a motion video sequence, whereas 
JPEG does not. 

MPEG-1 was the first MPEG standard defining the compression format for 
real-time audio and video. The video resolution is typically 352 x 240 or 352 x 
288, although higher resolutions are supported. The maximum bit-rate is 
about 1.5 Mbps. MPEG-1 is used for the Video CD format. 

MPEG-2 extends the MPEG-1 standard to cover a wider range of applications.
Higher video resolutions are supported to allow for HDTV applications, and 
both progressive and interlaced video are supported. 

MPEG-3 was originally targeted for HDTV applications. This was incorporated 
into MPEG-2, so there is no MPEG-3 standard. 

MPEG-4 (ISO/IEC 14496) supports an object-based approach, where scenes 
are modeled as compositions of objects, both natural and synthetic, with 
which the user may interact. Visual objects in a scene can be described math- 
ematically and given a position in a two- or three-dimensional space. Similarly, 
audio objects can be placed in a sound space. Thus, the video or audio object 
need only be defined once; the viewer can change his or her viewing position, 
and the calculations to update the audio and video are done locally. Classical 
rectangular video, as from a camera, is one of the visual objects supported. In 
addition, there is the ability to map images onto computer-generated shapes 
and a text-to-speech interface. 

H.264, a next-generation video codec, has been included in the MPEG-4 
standard as Part 10. 

Multichannel Television Sound. A generic name for various stereo audio 
implementations, such as BTSC and Zweiton. 

North American Broadcast Teletext Specification (EIA-516). This is also ITU- 
R BT.653 525-line system C teletext. However, the NABTS specification goes 
into much more detail. 

An electronic program guide (EPG) based on ETSI ETS 300 707. 

A technique of implementing digital stereo audio for PAL video using another 
audio subcarrier. The bit-rate is 728 kbps. It is discussed in BS.707 and ETSI 
EN 300 163. NICAM 728 is also used to transmit non-audio digital data in 
China. 

This is a method of scanning out a video display that is the total opposite of 
interlaced. All of the lines in the frame are scanned out sequentially, one right 
after the other. The term “field” does not apply in a noninterlaced system. 
Another term for a noninterlaced system is progressive scan. 

Never Twice the Same Color, Never The Same Color, or National Television 
Standards Committee, depending on who you’re talking to. Technically, NTSC 
is just a color modulation scheme. To specify the color video signal fully, it 
should be referred to as (M) NTSC. “NTSC” is also commonly (though
incorrectly) used to refer to any 525/59.94 or 525/60 video system. See also NTSC
4.43. 

This is an NTSC video signal that uses the PAL color subcarrier frequency
(about 4.43 MHz). It was developed by Sony in the 1970s to more easily adapt 
European receivers to accept NTSC signals. 

Abbreviation for Near-Video-On-Demand. See Video-On-Demand. 

Organisation Internationale de Radiodiffusion et Television. 

See Subtitles. 

See Raw VBI Data. 



When an image is displayed, it is overscanned if a small portion of the image 
extends beyond the edges of the screen. Overscan is common in all TVs. 

PAL stands for Phase Alternation Line, Picture Always Lousy, or Perfect At 
Last depending on your viewpoint. Technically, PAL is just a color modulation 
scheme. To fully specify the color video signal it should be referred to as (B,
D, G, H, I, M, N, or Nc) PAL. (B, D, G, H, I) PAL is the color video standard
used in Europe and many other countries. (M, N, Nc) PAL is also used in a
few places, but is not as popular. “PAL” is also commonly (though incorrectly) 
used to refer to any 625/50 video system. See also PAL 60. 

This is an NTSC video signal that uses the PAL color subcarrier frequency
(about 4.43 MHz) and PAL-type color modulation. It is a further adaptation of 
NTSC 4.43, modifying the color modulation in addition to changing the color 
subcarrier frequency. It was developed by JVC in the 1980s for use with their 
video disc players, hence the early name of “Disk-PAL.” 

There is a little-used variation, also called PAL 60, which is a PAL video 
signal that uses the NTSC color subcarrier frequency (about 3.58 MHz) and 
PAL-type color modulation. 

PALplus is a 16:9 aspect ratio version of PAL, designed to be transmitted using
normal PAL systems. 16:9 TVs without the PALplus decoder and standard 4:3 
TVs show a standard picture. It is defined by BT.1197 and ETSI ETS 300 731. 

See Program Delivery Control.

Pedestal is an offset used to separate the black level from the blanking level by 
a small amount. When a video system doesn’t use a pedestal, the black and 
blanking levels are the same. 

This is a term used to describe a method of adjusting the hue in an NTSC video
signal. The phase of the color subcarrier is moved, or adjusted, relative to the 
color burst. PAL and SECAM systems do not usually have a phase (or hue) 
adjust control. 

A pixel, which is short for picture element, is the smallest sample that makes 
up a scan line. For example, when the horizontal resolution is defined as 640 
pixels, that means that there are 640 individual locations, or samples, that 
make up the visible portion of each horizontal scan line. Pixels may be square 
or rectangular. 

The pixel clock is used to divide the horizontal line of video into samples. The 
pixel clock has to be stable (a very small amount of jitter) relative to the video 
or the image will not be stored correctly. The higher the frequency of the pixel 
clock, the more samples per line there are. 

This can be a real troublemaker, since it can cause artifacts. In some 
instances, a pixel drop-out looks like black spots on the screen, either station- 
ary or moving around. Several things can cause pixel drop-out, such as the 
ADC not digitizing the video correctly. Also, the timing between the ADC and 
the frame buffer might not be correct, causing the wrong number to be stored 
in memory. For that matter, the timing anywhere in the video stream might 
cause a pixel drop-out. 

A set of colors that can be combined to produce any desired set of
intermediate colors, within a limitation called the “gamut.” The primary colors for color
television are red, green and blue. The exact red, green and blue colors used 
are dependent on the television standard. Display devices do not usually use 
the same primary colors, resulting in minor color changes from ideal. 

Information sent during the vertical blanking interval using teletext to control 
VCRs in Europe. The specification is ETSI ETS 300 231. 



See Noninterlaced. 

Pseudo color is a term used to describe a technique that applies color, or
shows color, where it does not really exist. We are all familiar with the satellite 
photos that show temperature differences across a continent or the multicol- 
ored cloud motion sequences on the nightly weather report. These are real- 
world examples of pseudo color. The color does not really exist. A computer 
uses a lookup table memory to add the color so information, such as tempera- 
ture or cloud height, is viewable. 
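
A lookup table of this kind can be sketched in a few lines. The blue-to-red ramp, the 8-bit input range, and the function names are all assumptions chosen for illustration.

```python
def build_lut():
    """256-entry table of (R, G, B): low values map to blue, high to red."""
    return [(v, 0, 255 - v) for v in range(256)]


def apply_pseudo_color(image, lut):
    """Replace each single-component sample with its color from the table."""
    return [[lut[v] for v in row] for row in image]


temperatures = [[0, 128, 255]]   # one row of 8-bit measurement data
colored = apply_pseudo_color(temperatures, build_lut())
```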

This is basically the same as H.261. The term has faded away since H.261 is
used in applications other than ISDN video conferencing. 

See Quadrature Amplitude Modulation. 

Quarter Common Intermediate Format. This video format was developed to allow
the implementation of lower-cost video phones. The QCIF format has a resolu- 
tion of 176 x 144 active pixels and a frame rate of 29.97 frames per second. 

Quarter Standard Interface Format. The computer industry, which uses 
square pixels, has defined QSIF to be 160 x 120 active pixels, with a frame rate 
of whatever the computer is capable of supporting. 

Quad chroma refers to a technique where the sample clock is four times the 
frequency of the color burst. For NTSC this means that the sample clock is 
about 14.32 MHz (4x 3.579545 MHz), while for PAL the sample clock is about 
17.73 MHz (4x 4.43361875 MHz). The reason these are popular sample clock 
frequencies is that, depending on the method chosen, they make the chromi- 
nance (color) decoding and encoding easier. 
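
The numbers above, and one reason the 4x rate is convenient, can be checked with the short sketch below. With four samples per subcarrier cycle, the sine and cosine carriers only ever take the values 0, 1, 0 and -1, so multiplying by them needs no real multiplier. The constant and function names are just for the example.

```python
import math

NTSC_FSC_HZ = 3579545.0      # NTSC color subcarrier frequency
PAL_FSC_HZ = 4433618.75      # PAL color subcarrier frequency


def quad_chroma_clock(fsc_hz):
    """Sample clock at four times the color subcarrier frequency."""
    return 4 * fsc_hz


ntsc_clock = quad_chroma_clock(NTSC_FSC_HZ)   # about 14.32 MHz
pal_clock = quad_chroma_clock(PAL_FSC_HZ)     # about 17.73 MHz

# Carrier values at the four sample points within one subcarrier cycle.
sin_steps = [round(math.sin(2 * math.pi * n / 4)) for n in range(4)]
cos_steps = [round(math.cos(2 * math.pi * n / 4)) for n in range(4)]
```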

A method of encoding digital data onto a carrier for RF transmission. QAM is 
typically used for cable transmission of DTV signals. DVB-C supports 16- 
QAM, 32-QAM, 64-QAM, 128-QAM and 256-QAM, although receivers need 
only support up to 64-QAM. 

The modulation of two carrier components which are 90 degrees apart in 
phase. 

The process of converting a continuous analog signal into a set of discrete lev- 
els (digitizing). 

This is the inherent uncertainty introduced during quantization since only dis- 
crete, rather than continuous, levels are generated. Also called quantization 
distortion. 
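
Quantization and the error it introduces can be sketched as follows, assuming a uniform quantizer over the range 0 to full scale. The 3-bit width is chosen only to make the error easy to see, and the function name is hypothetical.

```python
def quantize(value, bits=8, full_scale=1.0):
    """Map value in [0, full_scale] to the nearest of 2**bits levels.

    Returns the integer code and the level that code represents.
    """
    levels = 2 ** bits
    step = full_scale / (levels - 1)
    code = round(value / step)
    code = max(0, min(levels - 1, code))   # clamp to the valid code range
    return code, code * step


original = 0.4
code, reconstructed = quantize(original, bits=3)
error = original - reconstructed   # quantization error for this sample
```

For a value inside the range, the error is never larger than half a step, which is why adding bits (smaller steps) reduces it.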

Essentially, a raster is the series of scan lines that make up a picture. You may
from time to time hear the term raster line — it’s the same as scan line. All of 
the scan lines that make up a frame of video form a raster. 

A technique where VBI data (such as teletext and captioning data) is sampled
by a fast sample clock (e.g., 27 MHz) and output. This technique allows
software decoding of the VBI data to be done.

Rewritable timecode, used in consumer video products. 

See RTCP. 



See RTSP. 



See RTP. 



Pixels that are not square pixels are rectangular pixels. 



This is the amount of color subcarrier information present in white, gray, or 
black areas of a composite color video signal (ideally, there is none present).
The number usually appears as -n dB. The larger “n” is, the better. 

This is the basic measurement of how much information is visible for an 
image. It is usually described as h x v. The h is the horizontal resolution 
(across the display) and the v is the vertical resolution (down the display). 
The higher the numbers, the better, since that means there is more detail to 
see. If only one number is specified, it is the horizontal resolution. 

Displays specify the maximum resolution they can handle, determined by 
the display technology and the electronics used. The actual resolution will be 
the resolution of either the source or the display, whichever is lower. 

Vertical resolution is the number of white-to-black and black-to-white tran- 
sitions that can be seen from the top to the bottom of the picture. The maxi- 
mum number is the number of active scan lines used by the image. The actual 
vertical resolution may be less due to processing, interlacing, overscanning, 
or limited by the source. 

Horizontal resolution is the number of white-to-black and black-to-white
transitions that can be seen from the left to the right of the picture. For digital 
displays, the maximum number is the number of active pixels used by a scan 
line. For both analog and digital displays, the actual horizontal resolution may 
be less due to processing, overscanning, or limited by the source. 

See RSVP. 



Retrace is what the electron beam does when it gets to the right-hand edge of 
the CRT display to get back to the left-hand edge. Retrace happens during the 
horizontal blanking time. 

Abbreviation for red, green, blue. RGB is used to denote linear RGB data. 
R'G'B' is used to denote gamma-corrected RGB data.

RS-170 is the U.S. standard that was used for black-and-white TV, and defines 
voltage levels, blanking times, the width of the sync pulses, and so forth. The 
specification spells out everything required for a receiver to display a mono- 
chrome picture. Now, SMPTE 170M is essentially the same specification, 
modified for color TV by adding the color components. They modified RS-170 
just a tiny little bit so that color could be added (RS-170A), with the final result 
being SMPTE 170M for NTSC. This tiny little change was so small that the 
existing black-and-white TVs didn’t even notice it. 

RS-343 does the same thing as RS-170, defining a specification for transferring 
analog video, but the difference is that RS-343 is for high-resolution computer 
graphics analog video, while RS-170 is for TV-resolution NTSC analog video. 

RSVP (Resource Reservation Protocol) is a control protocol that allows a 
receiver to request a specific quality of service level over an IP network. Real- 
time applications, such as streaming video, use RSVP to reserve necessary 
resources at routers along the transmission paths so that the requested band- 
width can be available when the transmission actually occurs. 

RTCP (RTP Control Protocol, also expanded as Real-Time Control Protocol) is a
control protocol designed to work in
conjunction with RTP. During an RTP session, participants periodically send 
RTCP packets to convey status on quality of service and membership manage- 
ment. RTCP also uses RSVP to reserve resources to guarantee a given quality 
of service. 
RTP (Real-Time Transport Protocol) is a packet format and protocol for the
transport of real-time audio and video data over an IP network. The data may 
be any file format, including MPEG-2, MPEG-4, ASF, QuickTime, etc. Imple- 
menting time reconstruction, loss detection, security, and content identifica- 
tion, it also supports multicasting (one source to many receivers) and 
unicasting (one source to one receiver) of real-time audio and video. One-way 
transport (such as video-on-demand) as well as interactive services (such as 
Internet telephony) is supported. RTP is designed to work in conjunction with 
RTCP. 

RTSP (Real-Time Streaming Protocol) is a client-server protocol to enable con- 
trolled delivery of streaming audio and video over an IP network. It provides 
VCR-style remote control capabilities such as play, pause, fast-forward and 
reverse. The actual data delivery is done using RTP. 

Run-length coding is a type of data compression. Let’s say that this page is 
wide enough to hold a line of 80 characters. Now, imagine a line that is almost 
blank except for a few words. It’s 80 characters long, but it’s just about all 
blanks — let’s say 50 blanks between the words “coding” and “medium.” These 
50 blanks could be stored as 50 individual codes, but that would take up 50 
bytes of storage. An alternative would be to define a special code that said a 
string of blanks is coming and the next number is the number of blanks in the 
string. So, using our example, we would need only 2 bytes to store the string 
of 50 blanks, the first special code byte followed by the number 50. We com- 
pressed the data; 50 bytes down to 2. This is a compression ratio of 25:1. Not 
bad, except that we only compressed one line out of the entire document, so 
we should expect that the total compression ratio would be much less. 

Run-length coding all by itself as applied to images is not as efficient as 
using a DCT for compression, since long runs of the same number rarely exist 
in real-world images. The only advantage of run-length coding over the DCT 
is that it is easier to implement. Even though run-length coding by itself is not 
efficient for compressing images, it is still used as part of the JPEG, MPEG, 
H.261, and H.263 compression schemes. 
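
The blank-run scheme from the 80-character example can be sketched in a few lines of Python. This is only an illustration: the marker byte value (0x1B), the minimum run length and the function names are assumptions made for the example, and the input is assumed never to contain the marker byte itself.

```python
ESC = 0x1B  # hypothetical "run of blanks follows" marker byte (an assumption)

def rle_encode_blanks(line: bytes, min_run: int = 3) -> bytes:
    """Replace each run of blanks with ESC followed by the run length."""
    out = bytearray()
    i = 0
    while i < len(line):
        if line[i] == 0x20:
            # Measure the run of blanks, capped at 255 so the count fits in a byte.
            j = i
            while j < len(line) and line[j] == 0x20 and j - i < 255:
                j += 1
            run = j - i
            if run >= min_run:
                out += bytes([ESC, run])
            else:
                out += line[i:j]  # short runs are cheaper stored as-is
            i = j
        else:
            out.append(line[i])
            i += 1
    return bytes(out)

def rle_decode_blanks(data: bytes) -> bytes:
    """Expand each (ESC, count) pair back into a run of blanks."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == ESC:
            out += b" " * data[i + 1]
            i += 2
        else:
            out.append(data[i])
            i += 1
    return bytes(out)

line = b"coding" + b" " * 50 + b"medium"
packed = rle_encode_blanks(line)
print(len(line), len(packed))  # 62 14
```

The 50 blanks shrink from 50 bytes to 2, which is the 25:1 ratio from the example, while the whole 62-byte line only shrinks to 14 bytes.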

In video, the red-minus-luma signal, also called a color difference signal. 
When added to the luma (Y) signal, it produces the red video signal. 

To obtain values of a signal at periodic intervals. Also the value of a signal at a 
given moment in time. 

A circuit that samples a signal and holds the value until the next sample is 
taken. 
Sample rate is how often a sample of a signal is taken. The sample rate is
determined by the sample clock. 

See Secondary Audio Program. 

Saturation is the amount of color present. For example, a lightly saturated red 
looks pink, while a fully saturated red looks like the color of a red crayon. Sat- 
uration does not mean the brightness of the color, just how much pigment is 
used to make the color. The less pigment, the less saturated the color is, effec- 
tively adding white to the pure color. 

Terrestrial digital television broadcast system adopted by Brazil. The RF mod- 
ulation is the same as used by ISDB-T. 

Scaling is the act of changing the resolution of an image. For example, scaling 
a 640 x 480 image by one-half results in a 320 x 240 image. Scaling by 2x 
results in an image that is 1280 x 960. There are many different methods for 
image scaling, and some look better than others. In general, though, the bet- 
ter the algorithm looks, the more expensive it is to implement. 
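
As a rough illustration of the cheap end of that trade-off, the sketch below scales an image by nearest-neighbor sampling, which simply repeats or drops samples. The function name and the list-of-lists image representation are assumptions made for the example; better-looking scalers filter the samples instead.

```python
def scale_nearest(image, factor):
    """Scale a 2-D list of samples by `factor` using nearest-neighbor sampling."""
    src_h, src_w = len(image), len(image[0])
    dst_h, dst_w = round(src_h * factor), round(src_w * factor)
    return [
        [image[min(int(y / factor), src_h - 1)][min(int(x / factor), src_w - 1)]
         for x in range(dst_w)]
        for y in range(dst_h)
    ]

src = [[0] * 640 for _ in range(480)]   # a blank 640 x 480 image
half = scale_nearest(src, 0.5)
double = scale_nearest(src, 2)
print(len(half[0]), len(half))      # 320 240
print(len(double[0]), len(double))  # 1280 960
```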

A scan line is an individual line across the display. It takes 525 of these scan 
lines to make up a 480i or 480p TV picture and 625 scan lines to make up a 
576i or 576p TV picture. 

See Velocity Scan Modulation. 



Syndicat des Constructeurs d'Appareils Radiorécepteurs et Téléviseurs. This
is a 21-pin connector supported by many consumer video components in 
Europe. It allows mono or stereo audio and composite, S-video, or RGB video 
to be transmitted between equipment. 

The IEC 60933-1 and 60933-2 standards specify the basic SCART connec- 
tor, including signal levels. 

The scRGB color space (formerly called sRGB64) extends the dynamic range, 
color gamut and bit precision over sRGB. The scRGB gamut is not only much 
larger than the sRGB gamut, but it is larger than what the human visual sys- 
tem can see. The specification for scRGB (IEC 61966-2-2) uses BT.709 chro- 
maticity, D65 reference white and linear RGB data (16 bits per color). 

Instead of using a normalized range of 0-1, a range of -0.5 to +7.4999 is 
supported. Values below 0 and above 1 are what give scRGB its larger
gamut, compared to sRGB, even though it has the same primary colors. 
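
The odd-looking upper limit falls out of how 16-bit code values map to linear values. The sketch below assumes the commonly described encoding, an offset of 4096 and a scale of 8192 code values per unit; treat both constants as assumptions to check against IEC 61966-2-2, and the function names as purely illustrative.

```python
SCALE = 8192   # assumed code values per 1.0 of linear light
OFFSET = 4096  # assumed code value that represents linear 0.0

def scrgb_decode(code: int) -> float:
    """Convert a 16-bit scRGB code value to a linear component value."""
    return (code - OFFSET) / SCALE

def scrgb_encode(value: float) -> int:
    """Convert a linear component value to a clamped 16-bit scRGB code."""
    return max(0, min(65535, round(value * SCALE) + OFFSET))

print(scrgb_decode(0))                       # -0.5
print(round(scrgb_decode(65535), 4))         # 7.4999
print(scrgb_encode(0.0), scrgb_encode(1.0))  # 4096 12288
```

Under these assumptions the normal 0-1 range occupies only codes 4096 through 12288, leaving the rest of the 16-bit range for values outside it.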
Serial Digital I/O. Another name for the 270 Mbps or 360 Mbps serial
interface defined by BT.656. It is used primarily on professional and studio video
equipment. 

Serial Data Transport Interface, defined by SMPTE 305M. 

See Standard-Definition Television. 

This is another color video format similar to PAL. The major differences 
between the two are that in SECAM the chroma is FM modulated and the R-Y 
and B-Y signals are transmitted line sequentially. SECAM stands for Sequen- 
tiel Couleur Avec Memoire or Sequential Color with Memory. 

Generally used to transmit audio in a second language. May also be used to 
transmit the [aural] description of key visual elements of a program, inserted 
into the natural pauses in the audio of the programming. 

These are pulses that occur during the vertical sync interval of NTSC, PAL 
and SECAM, at twice the normal horizontal scan rate. They exist to ensure
correct 2:1 interlacing in early televisions and to eliminate DC offset
buildup.

Setup is the same thing as Pedestal. 

Standard (or Source) Input Format. This video format was developed to allow 
the storage and transmission of digital video. The 625/50 SIF format has a res- 
olution of 352 x 288 active pixels and a frame rate of 25 frames per second. The 
525/60 SIF format has a resolution of 352 x 240 active pixels and a frame rate
of 30 frames per second. Note that MPEG-1 allows resolutions up to 4095 x 
4095 active pixels; however, there is a “constrained subset” of parameters 
defined as SIF. The computer industry, which uses square pixels, has defined 
SIF to be 320 x 240 active pixels, with a frame rate of whatever the computer is 
capable of supporting. 

Signal-to-noise ratio is the magnitude of the signal divided by the amount of 
unwanted stuff that is interfering with the signal (the noise). SNR is usually
described in decibels, or dB, for short; the bigger the number, the better look- 
ing the picture. 
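
A quick sketch of the decibel conversion, assuming the signal and noise are both given as RMS amplitudes (voltages), so the 20 log10 form applies; for power measurements the factor would be 10 instead. The function name and the sample values are made up for illustration.

```python
import math

def snr_db(signal_rms: float, noise_rms: float) -> float:
    """Return the signal-to-noise ratio in dB for two RMS amplitudes."""
    return 20.0 * math.log10(signal_rms / noise_rms)

# A 0.7 V signal with 0.7 mV of noise is a 1000:1 amplitude ratio.
print(round(snr_db(0.7, 0.0007), 1))  # 60.0
```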

Silent Radio is a service that feeds data that is often seen in hotels and night- 
clubs. It’s usually a large red sign that shows current news, events, scores, etc. 
It is present on NTSC lines 10-11 and 273-274, and uses encoding similar to 
CEA-608. 
A technique where a VBI decoder samples the VBI data (such as teletext and
captioning data), locks to the timing information and converts it to binary 0’s
and 1’s. DC offsets, amplitude variations and ghosting must be compensated
for by the VBI decoder to accurately recover the data. 

Defines the longitudinal (LTC) and vertical interval (VITC) timecode for 480i 
and 576i video systems. LTC requires an entire field time to store timecode 
information, using a separate track. VITC uses one scan line each field during 
the vertical blanking interval. 

720 x 480 pro-video interlaced standard (29.97 Hz). Covers the digital
representation and the digital parallel interface. Also see BT.601 and BT.656.

NTSC video specification for the United States. See RS-170A and BT.470. 

1920 x 1035 pro-video interlaced standard (29.97 or 30 Hz). Covers the analog
RGB and YPbPr representation. The digital parallel interface is defined by 
SMPTE 260M. The digital serial interface is defined by SMPTE 292M. 

768 x 486 pro-video interlaced standard (29.97 Hz). Covers the digital repre- 
sentation (composite NTSC video sampled at 4x Fsc) and the digital parallel 
interface. The digital serial interface is defined by SMPTE 259M. 

Analog RGB video interface specification for pro-video SDTV systems. 

960 x 480 pro-video interlaced standard (29.97 Hz). Covers the digital
representation and the digital parallel interface. Also see BT.601 and BT.1302.

1920 x 1080 pro-video interlaced and progressive standards (29.97, 30, 59.94, 
and 60 Hz). Covers the digital representation, the analog RGB and YPbPr 
interfaces and the digital parallel interface. The digital serial interface is 
defined by SMPTE 292M. 

720 x 480 pro-video progressive standards (59.94 Hz). Covers the digital
representation, the analog RGB and YPbPr interfaces and the digital parallel
interface. The digital serial interface is defined by SMPTE 294M. Also see BT.1358
and BT.1362. 

1280 x 720 pro-video progressive standards. Covers the digital representation 
and the analog RGB and YPbPr interfaces. The digital parallel interface uses 
SMPTE 274M. The digital serial interface is defined by SMPTE 292M. 
SMPTE 314M

SMPTE 344M
SMPTE 348M 

SMPTE 370M 
SMPTE 421M 
SMPTE RP160 
SPDIF 

Split Sync 
Serial data transport interface (SDTI). This is a 270 or 360 Mbps serial
interface based on BT.656 that can be used to transfer almost any type of digital
data, including MPEG-2 program and transport bitstreams, DV bitstreams, 
etc. You cannot exchange material between devices that use different data 
types. Material that is created in one data type can only be transported to 
other devices that support the same data type. There are separate map docu- 
ments that format each data type into the 305M transport. 

Defines the data structure for DV audio, data and compressed video at 25 and 
50 Mbps. Also see BT.1618 and IEC 61834. 

Defines a 540 Mbps serial digital interface for pro-video applications. 

High data-rate serial data transport interface (HD-SDTI). This is a 1.485 Gbps 
serial interface based on SMPTE 292M that can be used to transfer almost any 
type of digital data, including MPEG-2 program and transport bitstreams, DV 
bitstreams, etc. You cannot exchange material between devices that use differ- 
ent data types. Material that is created in one data type can only be trans- 
ported to other devices that support the same data type. There are separate 
map documents that format each data type into the 348M transport. 

This SMPTE standard specifies a data structure for DV-based audio, data, and 
compressed video at data rates of 100 Mbps. Also see BT.1620. 

This SMPTE standard, originally known as VC-1, is based on the WMV9 video 
codec from Microsoft®. It is designed to compete with MPEG-4.10 (H.264).

Analog RGB and YPbPr video interface specification for pro-video HDTV sys- 
tems. 

Abbreviation for Sony/Philips Digital InterFace. This is a consumer interface 
used to transfer either compressed or 2-channel LPCM digital audio. A serial, 
self-clocking scheme is used, based on a coax or fiber interconnect. IEC 60958 
now fully defines this interface for consumer and professional applications. 

Split sync is a video scrambling technique, usually used with either horizontal 
blanking inversion, active video inversion, or both. In split sync, the horizontal 
sync pulse is split, with the second half of the pulse at +100 IRE instead of the 
standard -40 IRE. Depending on the scrambling mode, either the entire hori- 
zontal blanking interval is inverted about the +30 IRE axis, the active video is 
inverted about the +30 IRE axis, both are inverted, or neither is inverted. By 
splitting the horizontal sync pulse, a reference of both -40 IRE and +100 IRE is 
available to the descrambler. 
Since a portion of the horizontal sync is still at -40 IRE, some sync
separators may still lock on the shortened horizontal sync pulses. However, the
timing circuits that look for color burst a fixed interval after the end of horizontal
sync may be confused. In addition, if the active video is inverted, some video 
information may fall below 0 IRE, possibly confusing sync detector circuits. 

The burst is always present at the correct frequency and timing; however, 
the phase is shifted 180 degrees when the horizontal blanking interval is 
inverted. 

When the ratio of active pixels per line to active lines per frame is the same as 
the display aspect ratio. This is the same as the sampling lattice having equal 
spatial horizontal and vertical spacing of the sampling points. 
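
The definition can be checked with a little arithmetic. The sketch below divides the display aspect ratio by the ratio of active pixels to active lines; a result of exactly 1 means square pixels. The function name is an assumption made for the example, and the calculation takes the definition at face value, ignoring details such as how much of the line is actually displayed.

```python
from fractions import Fraction

def pixel_aspect_ratio(active_pixels: int, active_lines: int,
                       display_aspect: Fraction) -> Fraction:
    """Width-to-height ratio of one pixel; exactly 1 means square pixels."""
    return display_aspect / Fraction(active_pixels, active_lines)

print(pixel_aspect_ratio(640, 480, Fraction(4, 3)))     # 1
print(pixel_aspect_ratio(720, 480, Fraction(4, 3)))     # 8/9
print(pixel_aspect_ratio(1920, 1080, Fraction(16, 9)))  # 1
```

So 640 x 480 on a 4:3 display and 1920 x 1080 on a 16:9 display use square pixels, while 720 x 480 on a 4:3 display does not.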

The specification for sRGB (IEC 61966-2-1) uses BT.709 chromaticity, D65 
reference white, a display gamma of 2.2 and linear RGB (8 bits per color). 

sRGB values have a normalized range of 0-1, with 8-bit digital sRGB val- 
ues having a range of 0-255 for black-white. A version called “Studio RGB” 
uses an 8-bit range of 16-235 for black-white, enabling compatibility with 
video applications. 
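
A minimal sketch of moving between the two 8-bit ranges. Only the end points (0-255 and 16-235) come from the definition; the linear mapping with rounding, the clamping and the function names are assumptions made for illustration.

```python
def full_to_studio(v: int) -> int:
    """Map an 8-bit full-range value (0-255) to the 16-235 studio range."""
    return 16 + round(v * 219 / 255)

def studio_to_full(v: int) -> int:
    """Map a 16-235 studio-range value back to 0-255, clamping out-of-range input."""
    return max(0, min(255, round((v - 16) * 255 / 219)))

print(full_to_studio(0), full_to_studio(255))   # 16 235
print(studio_to_full(16), studio_to_full(235))  # 0 255
```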

One limitation of sRGB is that since the normalized values are restricted 
to the 0-1 range, colors outside the gamut (the triangle produced by them) 
cannot be used. For this reason, the extended RGB color space, “scRGB”, was 
developed. 

SDTV is content or a display capable of displaying a maximum of 576 inter- 
laced active scan lines. No aspect ratio is specified. 



Compressed audio and video that is transmitted over the Internet or other net- 
work in real time. It usually offers VCR-style remote control capabilities such 
as play, pause, fast-forward and reverse. 

A secondary signal containing additional information that is added to a main 
signal. 

Subsampled means that a signal has been sampled at a lower rate than some 
other signal in the system. A prime example of this is the 4:2:2 YCbCr color 
space. For every two luma (Y) samples, only one Cb and Cr sample is present. 
This means that the Cb and Cr signals are subsampled. 
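
In code, the simplest possible 4:2:2 subsampler keeps every Y sample and every second Cb and Cr sample. This is only a sketch with a made-up function name; dropping samples without filtering is an assumption chosen for brevity, and practical designs low-pass filter the chroma first.

```python
def subsample_422(y, cb, cr):
    """Keep every Y sample but only every second Cb and Cr sample."""
    return list(y), list(cb[::2]), list(cr[::2])

y = [16, 32, 48, 64]
cb = [100, 102, 104, 106]
cr = [200, 202, 204, 206]
y2, cb2, cr2 = subsample_422(y, cb, cr)
print(len(y2), len(cb2), len(cr2))  # 4 2 2
```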
Subtitles



Superblack 



Super VideoCD (Super VCD, SVCD)



S-Video 



SVM 



Text that is added below or over a picture that usually reflects what is being 
said, possibly in another language. Open subtitles are transmitted as video 
that already has the subtitles present. Closed subtitles are transmitted during 
the VBI, and rely on the TV to decode them and position them below or over 
the picture. Closed captioning is a form of subtitling. Subtitling for DVB is 
specified in ETSI ETS 300 743. 

A keying signal that is embedded within a video signal as a level between 
black and sync. It is usually used to improve luma self-keying because the 
video signal contains black, making a good luma self-key hard to implement. 
When a downstream keyer detects the super black level, it inserts a second 
video signal. 

Defined by the China National Technical Committee of Standards on Record- 
ing, this CD standard holds 35-70 minutes of digital audio and video informa- 
tion. MPEG-2 video is used, with a resolution of 480 x 480 (29.97 Hz frame 
rate) or 480 x 576 (25 Hz frame rate). Audio uses MPEG layer 2 at a bit-rate of 
32-384 kbps, and supports four mono, two stereo, or 5.1 channels. Subtitles 
use overlays rather than subpictures (DVD-Video) or being encoded as video 
(VideoCD). Variable bit-rate encoding is used, with a maximum bit-rate of 2.6 
Mbps. IEC 62107 defines the Super VideoCD standard. 

XSVCD, although not an industry standard, increases the video resolution 
and bit-rate to improve the video quality over SVCD. MPEG-2 video is still 
used, with a resolution of 720 x 480 (29.97 Hz frame rate) or 720 x 576 (25 Hz 
frame rate). Variable bit-rate encoding is still used, with a maximum bit-rate of 
9.8 Mbps. 

Separate video, also called Y/C video. Separate luma (Y) and chroma (C) 
video signals are used, rather than a single composite video signal. By simply 
adding together the Y and C signals, you generate a composite video signal. 

A DC offset of +2.3 V may be present on the C signal when a letterbox
picture format is present. A DC offset of +5 V may be present to indicate when a
16:9 anamorphic picture format is present. A standard 4:3 receiver ignores all 
DC offsets, thus displaying a typical letterboxed picture. 

The IEC 60933-5 standard specifies the S-video connector, including signal 
levels. 

See Velocity Scan Modulation. 
Sync is a fundamental, you gotta have it, piece of information for displaying
any type of video. Essentially, the sync signal tells the display where to put the 
picture. The horizontal sync, or HSYNC for short, tells the display where to 
put the picture in the left-to-right dimension. The vertical sync, or VSYNC for 
short, tells the display where to put the picture from top-to-bottom. 

Analog SDTV and EDTV signals use a bi-level sync, where the sync level 
is a known value below the blanking level. Analog HDTV signals use a tri-level 
sync, where the sync levels are known values above and below the blanking 
level. 

The reason analog HDTV signals use a tri-level sync signal is timing accu- 
racy. The horizontal timing reference point for a bi-level sync signal is defined 
as the 50% point of the leading edge of the horizontal sync pulse. In order to 
ascertain this point precisely, it is necessary to determine both the blanking 
level and sync-tip level and determine the mid-point value. If the signal is in 
any way distorted, this will reduce the timing accuracy. 

With a tri-level sync signal, the timing reference point is the rising edge of 
the sync signal as it passes through the blanking level. This point is much eas- 
ier to accurately determine, and can be implemented relatively easily. It is also 
more immune to signal distortion. 

A sync generator is a circuit that provides sync signals. A sync generator may 
have genlock capability. 

A sync noise gate is used to define an area within the analog video waveform 
where the video decoder is to look for the sync pulse. Anything outside of this 
defined window will be rejected. The main purpose of the sync noise gate is to 
make sure that the output of the video decoder is nice, clean and correct. 

An analog video signal contains video information, which is the picture to be 
displayed, and timing (sync) information that tells the receiver where to put 
this video information on the display. A sync stripper pulls out the sync infor- 
mation from the analog video signal and throws the rest away. 

Refers to two or more events that happen in a system or circuit at the same 
time. 

See Super VideoCD. 

A method of transmitting data with a video signal. ITU-R BT.653 lists the 
major teletext systems used around the world, while ETSI ETS 300 706 
defines in detail the teletext standard for PAL. North American Broadcast 
Teletext Specification (NABTS) is 525-line system C. 
For digital transmissions such as HDTV and SDTV, the teletext characters
are multiplexed as a separate stream along with the video and audio data. It is 
common practice actually to embed this stream in the MPEG video bitstream 
itself, rather than at the transport layer. Unfortunately there is no widespread 
standard for this teletext stream — each system (DSS, DVB, ATSC, DVD) has 
its own solution. 

This is what the Europeans call serrated sync. See Serration Pulses and Sync. 



Certain video sources have their sync signals screwed up. The most common 
of these sources is the VCR. A timebase corrector fixes a video signal that has 
bad sync timing. 

A sync signal that has three levels, and is commonly used for analog HDTV 
signals. See the definition for Sync. 

True color means that each sample of an image is individually represented 
using three color components, such as R'G'B' or YCbCr.

When an image is displayed, it is underscanned if all of the image, including
the top, bottom and side edges, is visible on the display. Underscan is
common in computer displays.

The carrier used by Earth stations to transmit information to a satellite. 

See CEA-608. 

Variable bit-rate (VBR) means that a bitstream (compressed or uncom- 
pressed) has a changing number of bits each second. Simple scenes can be 
assigned a low bit-rate, with complex scenes using a higher bit-rate. This 
enables maintaining the audio and video quality at a more consistent level. 

See Vertical Blanking Interval. 

Abbreviation for Variable Bit-Rate. 

Original name of the SMPTE 421M video codec, based on the Microsoft® 
WMV9 video codec. 

Abbreviation for VideoCD. 
Commonly used in CRT-based TVs to increase the apparent sharpness of a
picture. At horizontal dark-to-light transitions, the beam scanning speed is 
momentarily increased approaching the transition, making the display rela- 
tively darker just before the transition. Upon passing into the lighter area, the 
beam speed is momentarily decreased, making the display relatively brighter 
just after the transition. The reverse occurs in passing from light to dark. 

During the vertical blanking interval, the video signal is at the blanking level 
so as not to display the electron beam when it sweeps back from the bottom to 
the top side of the CRT screen. 

Timecode information is stored on a scan line during each vertical blanking 
interval. 



See Resolution. 



For noninterlaced video, this is the same as the frame rate. For interlaced 
video, it is usually one-half the field rate. 

This is the portion of the analog video signal that tells the video decoder 
where the top of the picture is. 

A method of encoding digital data onto a carrier for RF transmission. 8-VSB is 
used for over-the-air broadcasting of ATSC HDTV in the United States. 

A specific frequency that is modulated with video data before being mixed 
with the audio data and transmitted. 

A digital video interface designed to simplify interfacing video ICs together.
One portion is a digital video interface based on BT.656; a second portion is a
host processor interface. VIP is a VESA specification.

Video mixing is taking two independent video sources (they must be gen- 
locked) and merging them together. See Alpha Mix. 

Converting a baseband video signal to an RF signal. 



Yet another digital video interface designed to simplify interfacing video ICs 
together. 
Video-On-Demand Video-on-demand, or VOD, allows users to select which program to
view at their convenience, with playback starting almost immediately. When used
over the Internet or other network, it is commonly called “streaming video.” For
broadcast, satellite, and cable networks, it is commonly called “pay-per-view” and
is usually confined to specific start times. For this reason, it may also be
referred to as “near video-on-demand” or NVOD.

Video Program System (VPS) VPS is used in some countries instead of PDC to control
VCRs. The data format is the same as for PDC, except that it is transmitted on a
dedicated line during the vertical blanking interval, usually line 16.

VideoCD Compact discs that hold up to about an hour of digital audio and video
information. MPEG-1 video is used, with a resolution of 352 x 240 (29.97 Hz frame
rate) or 352 x 288 (25 Hz frame rate). Audio uses MPEG layer 2 at a fixed
bit-rate of 224 kbps, and supports two mono or one stereo channel (with optional
Dolby® Pro Logic). Fixed bit-rate encoding is used, with a bit-rate of 1.15 Mbps.
The next generation, defined for the Chinese market, is Super VideoCD.

XVCD, although not an industry standard, increases the video resolution 
and bit-rate to improve the video quality over VCD. MPEG-1 video is still used, 
with a resolution of up to 720 x 480 (29.97 Hz frame rate) or 720 x 576 (25 Hz 
frame rate). Fixed bit-rate encoding is still used, with a bit-rate of 3.5 Mbps. 

VIP See Video Interface Port. 

VITC See Vertical Interval Timecode. 

VMI See Video Module Interface. 

VOB DVD-Video movies are stored on the DVD using VOB files. They usually
contain multiplexed Dolby® Digital audio and MPEG-2 video. VOB files are
named as follows: vts_XX_Y.vob where XX represents the title and Y the part
of the title. There can be 99 titles and 10 parts, although vts_XX_0.vob never
contains video, usually just menu or navigational information.
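
The naming scheme is regular enough to parse with a short pattern. The sketch below is an illustration only: the function name, the case-insensitive match and the decision to reject title 00 (on the reading that the 99 titles are numbered 01-99) are assumptions, not part of the DVD-Video specification.

```python
import re

# Pattern for the naming scheme described above: two-digit title, one-digit part.
VOB_NAME = re.compile(r"^vts_(\d{2})_(\d)\.vob$", re.IGNORECASE)

def parse_vob_name(filename: str):
    """Return (title, part, has_video) or None if the name does not match."""
    m = VOB_NAME.match(filename)
    if m is None:
        return None
    title, part = int(m.group(1)), int(m.group(2))
    if title == 0:
        return None  # assumption: titles are numbered 01-99
    # Part 0 holds menu or navigation data rather than the title's video.
    return title, part, part != 0

print(parse_vob_name("vts_01_1.vob"))  # (1, 1, True)
print(parse_vob_name("vts_01_0.vob"))  # (1, 0, False)
print(parse_vob_name("movie.vob"))     # None
```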

VOD See Video-On-Demand. 

VPS See Video Program System. 

VSB See Vestigial Sideband. 

VSM See Velocity Scan Modulation. 
Y/C Video
Check out the Vertical Sync definition.

This level defines what white is for the particular video system. 

WSS may be used on 576i line 23 and 480i lines 20 and 283 to specify the 
aspect ratio of the program and other information. 16:9 TVs may use this infor- 
mation to allow displaying of the program in the correct aspect ratio. ITU-R 
BT.1119 and ETSI EN 300 294 specify the WSS signal for 576i and 480i sys- 
tems. EIA-J CPR-1204 and IEC 61880 also specify another WSS signal for 480i 
systems. 

BT.653 525-line and 625-line system B teletext. 



See Wide Screen Signaling. 

See World System Teletext. 

Abbreviation for extended Super VideoCD. See Super VideoCD. 

Abbreviation for extended VideoCD. See VideoCD. 

The xvYCC (extended gamut YCbCr for video) color space extends the color 
gamut of normal YCbCr, enabling 1.8x more colors to be reproduced. The 
specification for xvYCC (IEC 61966-2-4) uses BT.709 chromaticity and D65 
reference white. The equations for converting between scR'G'B' and xvYCbCr
are the same as those used for converting between R'G'B' and YCbCr.

xvYCC-based YCbCr data has an 8-bit range of 1-254, enabling backwards 
compatibility with existing designs. Y has an 8-bit range of -15/219 to +238/ 
219 (-0.068493 to +1.086758); CbCr has an 8-bit range of -15/224 to +238/224 
(-0.066964 to +1.062500). 
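
The quoted limits can be reproduced by normalizing the 8-bit end points 1 and 254 against a code of 16, using 219 code values for Y and 224 for Cb and Cr. The helper below is a sketch; its name and parameters are made up for this example.

```python
def xvycc_normalized(code: int, span: int, black: int = 16) -> float:
    """Normalize an 8-bit code relative to `black`, over `span` code values."""
    return (code - black) / span

# Y is measured over 219 codes; per the ranges quoted above, Cb and Cr are
# measured over 224 codes from the same code of 16.
print(f"{xvycc_normalized(1, 219):.6f} {xvycc_normalized(254, 219):.6f}")
print(f"{xvycc_normalized(1, 224):.6f} {xvycc_normalized(254, 224):.6f}")
# -0.068493 1.086758
# -0.066964 1.062500
```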

See S-video. 

A Y/C separator is what’s used in an NTSC or PAL video decoder to separate
the luma and chroma. This is the first thing that any NTSC/PAL video decoder
must do. The composite video signal is fed to a Y/C separator so that the
chroma can then be decoded further.
YCbCr



YIQ 



YPbPr 



YUV 



YCbCr is the color space originally defined by BT.601, and now used for all 
digital component video formats. Y is the luma component and the Cb and Cr 
components are color difference signals. The technically correct notation is 
Y'Cb'Cr' since all three components are derived from R'G'B'. Most people
use the YCbCr notation rather than Y'CbCr or Y'Cb'Cr'.

4:4:4 YCbCr means that for every Y sample, there is one sample each of 
Cb and Cr. 

4:2:2 YCbCr means that for every two horizontal Y samples, there is one 
sample each of Cb and Cr. 

4:1:1 YCbCr means that for every four horizontal Y samples, there is one 
sample each of Cb and Cr. 

4:2:0 YCbCr means that for every block of 2 x 2 Y samples, there is one 
sample each of Cb and Cr. There are three variations of 4:2:0 YCbCr, with the 
difference being the position of Cb and Cr sampling relative to Y. 

Note that the coefficients to convert R'G'B' to YCbCr are different for
SDTV and HDTV applications. 

YIQ is a color space optionally used by the NTSC video system. The Y compo- 
nent is the black-and-white portion of the image. The I and Q parts are the 
color difference components; these are effectively nothing more than color 
placed over the black and white, or luma, component. Many people use the 
YIQ notation rather than Y'IQ or Y'I'Q'. The technically correct notation is
Y'I'Q' since all three components are derived from R'G'B'.

YPbPr is the analog version of the YCbCr color space, with specific levels and 
timing signals, designed to interface equipment together. Consumer video 
standards are defined by EIA-770; the professional video standards are 
defined by numerous SMPTE standards. VBI data formats for EIA-770 are 
defined by CEA-805. Many people use the YPbPr notation rather than Y'PbPr
or Y'Pb'Pr'. The technically correct notation is Y'Pb'Pr' since all three
components are derived from R'G'B'.

YUV is the color space used by the NTSC and PAL video systems. As with the 
YIQ color space, the Y is the luma component while the U and V are the color 
difference components. Many people use the YUV notation when they actually 
mean YCbCr data. Most use the YUV notation rather than Y'UV or Y'U'V'.
The technically correct notation is Y'U'V' since all three components are
derived from R'G'B'.

YUV is also the name for some component analog interfaces on consumer 
equipment. Some manufacturers incorrectly label it YCbCr. THX certification 
requires it to be labeled YPbPr. 
Intel’s 4:1:0 YCbCr format. The picture is divided into blocks, with each block
comprising 4x4 samples. For each block, sixteen 8-bit values of Y, one 8-bit 
value of Cb and one 8-bit value of Cr are assigned. The result is an average of 9 
bits per pixel. 

Intel’s notation for MPEG-1 4:2:0 YCbCr stored in memory in a planar format. 
The picture is divided into blocks, with each block comprising 2x2 samples. 
For each block, four 8-bit values of Y, one 8-bit value of Cb and one 8-bit value 
of Cr are assigned. The result is an average of 12 bits per pixel. 
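
The 9-bit and 12-bit averages in the two entries above come from the same arithmetic: every pixel in a block gets its own Y value, and the block shares one Cb and one Cr. A small sketch, with a function name invented for the example:

```python
from fractions import Fraction

def average_bits_per_pixel(block_w: int, block_h: int, bits: int = 8) -> Fraction:
    """Average bits per pixel when a block stores all Y samples plus one Cb and one Cr."""
    pixels = block_w * block_h
    total_bits = (pixels + 2) * bits  # Y for each pixel, plus one Cb and one Cr
    return Fraction(total_bits, pixels)

print(average_bits_per_pixel(4, 4))  # 9
print(average_bits_per_pixel(2, 2))  # 12
```

A 1 x 1 "block" gives 24 bits per pixel, which is the 4:4:4 case.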

Intel’s notation for 4:2:2 YCbCr format. 

See the definition for Creepy-Crawlies. 

Zoom is a type of image scaling. Zooming is making the picture larger so that 
you can see more detail. The examples described in the definition of scaling 
are also examples that could be used here. 

A technique of implementing stereo or dual-mono audio for NTSC and PAL 
video. One FM subcarrier transmits an L+R signal and a second FM subcar- 
rier transmits an R signal (for stereo) or a second L+R signal. It is discussed in 
BS.707, and is similar to the BTSC technique. 




Index 



Numerics 

10-step staircase test signal 325 
10T pulse 336 
12.5T pulse 336
1394 

see IEEE 1394 
20T pulse 336 

2-2 pulldown 240
24-1 pulldown 241 
25T pulse 336 
2CIF 846 

2D comb filter 451 
2T pulse 336 

3-2 pulldown 240
3-3 pulldown 241 
3D comb filter 458 
4:1:1 YCbCr 22 
4:2:0 YCbCr 22 
4:2:2 YCbCr 22 

4:4:4 to 4:2:2 conversion 195 
4:4:4 YCbCr 21 
4CIF 846 

A 

active format description 718 
adaptive comb filter 457 
adaptive contrast enhancement 202 
AFD 718
AIT 816 



alpha 204, 422, 458 
alpha channel 422, 458 
alpha mixing 204 
AMOL 381, 713 
analog component video 77, 90 
1080i 59 
1080p 59 

1125-line interlaced 59 
1125-line progressive 59 
1152i 59 

1250-line interlaced 59 
480i 39 
480p 41 

525-line interlaced 39 
525-line progressive 41 
576i 48 
576p 48 

625-line interlaced 48 
625-line progressive 48 
720p 56 

750-line progressive 56 
CGMS-A 82, 84, 87, 94, 96 
copy generation management system 
(CGMS-A) 82, 84, 87, 94, 96 
VBI data 82, 87, 94, 96 
YPbPr 77, 90 
ancillary data 108, 130 
anti-aliased resampling 224 
application information table 816 






ARIB 812 
ARIB over IP 835 
ARIB STD-B10 812 
ARIB STD-B16 812 
ARIB STD-B20 812 
ARIB STD-B21 812 
ARIB STD-B23 812 
ARIB STD-B24 812 
ARIB STD-B25 812 
ARIB STD-B31 812 
ARIB STD-B32 812 
ARIB STD-B40 812 
ARIB STD-B5 825 
arrival time stamp 662 
ATSC 764 
audio 766 

commentary 767 
complete main 766 
dialogue 767 
hearing impaired 766 
music and effects 766 
visually impaired 766 
voice-over 767 
A-VSB 765 
broadcast flag 697 
closed captioning 708 
data broadcasting 773 
descriptors 695, 770 
enhanced 8-vsb 772 
E-VSB 772 

program and system information protocol 
768, 772 
PSIP 768, 772 
SI tables 768 
video 766 
ATSC A/49 383 
ATSC A/52 764, 778 
ATSC A/53 764, 778 
ATSC A/57 764 
ATSC A/64 764 
ATSC A/65 764, 778 
ATSC A/70 764 



ATSC A/80 764 
ATSC A/81 764 
ATSC A/90 764, 778 
ATSC A/92 764 
ATSC A/93 764 
ATSC A/94 764 
ATSC A/95 764 
ATSC A/96 764 
ATSC A/97 764 
audio 

ATSC 766, 772 
BTSC 266 
DV 517 
DVB 798 
EIAJ 268 
ISDB 814 
MPEG-1 541 
ASPEC 542 

background theory 542 
MUSICAM 541 
sound quality 541 
MPEG-2 578 
MPEG-4 739 
NICAM 728 289 
NTSC 266 
OpenCable 780 
PAL 289 
Zweiton 289 
audio service 
ATSC 

commentary 767 
complete main 766 
dialogue 767 
hearing impaired 766 
music and effects 766 
visually impaired 766 
voice-over 767 
automatic gain control 424 
AVCHD 536 
AVS 841 
A-VSB 765 







B 

B frame 481, 544 
B pictures 585 
B slice 760 
B VOP 741 

backward prediction 546, 587 
bandwidth-limited edge generation 416 
BAT 799, 816 
Betacam interface 100 
bidirectional frame 481, 544 
bidirectional pictures 585 
bidirectional prediction 546, 587 
bidirectional slice 760 
BIFS 751 

bilinear interpolation 224 
black burst 406 
black level control 198 
black stretch 202 
block 248 
H.264 759 
MPEG-1 546 
MPEG-2 587 
MPEG-4.10 759 
MPEG-4.2 741 
blue enhancement 202 
blue stretch 202 
bob and weave 243 
bouquet association table 799, 816 
Bresenham algorithm 224 
brightness control 198 
broadcast flag 697 
broadcaster information table 816 
Bruch blanking 414 

BS.707 289 
BT 816 

BT.1119 300, 369 
BT.1120 114, 128 
BT.1124 383 
BT.1197 300 
BT.1302 112, 128 
BT.1303 116 
BT.1358 39, 45, 53 
BT.1362 129 



BT.1381 143 

BT.1577 144 

BT.470 465 

BT.471 313 

BT.473 333 

BT.601 37, 41, 48 

BT.601 IC video interface 149 

BT.653 374 

BT.656 112, 128 

BT.656 IC video interface 156 

BT.709 39, 62, 64 

BT.799 116 

BT.809 378 

BTSC 266 

burst generation 402 

C 

cable virtual channel table 768 
CAT 672 
CDT 816 
CEA-608 346, 706 
CEA-708 706 
CEA-805 82, 84, 94, 96 
CEA-861 167 
CENELEC EN50049 69 
CGMS-A 82, 84, 87, 94, 96, 362, 369, 372 
chroma keying 214 
chroma spill 215 
garbage matte 217 
linear keying 214 
luminance modulation 215 
shadow chroma keying 215 
wide keying 221 
chroma spill 215 
chromaticity coordinates 31 
chromaticity diagram 28 
chrominance (analog) 402 
chrominance bars 
EBU 403 
EIA 403 

chrominance demodulation 425 
chrominance demodulator 425 
chrominance nonlinear gain 419, 464 
chrominance nonlinear phase 419, 464 
chrominance-luminance intermodulation 419, 
464 
CIF 846 

clean encoding 415 
closed captioning 368, 706 
ATSC 708 

captioning channels 368 
CC1 channel 368 
CC2 channel 368 
CC3 channel 368 
CC4 channel 368 
CEA-608 346 
CEA-708 706 
ETSI EN301775 712 
Europe 368 
H.264 712 
ISDB 825 
MPEG-2 708 
MPEG-4.10 712 
OpenCable 710 
PAL 368 
SCTE 20 710 
SCTE 21 710 
SMPTE 421M 712 
T1 channel 368 
T2 channel 368 
T3 channel 368 
T4 channel 368 
text channels 368 
VBI standard 712 
VC-1 712 

closed GOP 545, 585 
color bars 312 

100% HDTV YPbPr 93 
100% SDTV YPbPr 80 
75% HDTV YPbPr 93 
75% SDTV YPbPr 80 
HSI 30 
HSV 29 
NTSC 316 
PAL 319 



RGB 16 
YCbCr 19 

color burst detection 433 
color control 198 

color saturation accuracy 420, 465 
color space 
HLS 27 
HSI 27 
HSV 27 
PhotoYCC 26 
RGB 15 
scRGB 17, 26 
sRGB 16 

studio RGB 19, 20, 21 
xvYCC 26 
YCbCr 19 

4:4:4 to 4:2:2 conversion 195 
YIQ 18 
YUV 17 
YUV12 22 

color space equations 

constant luminance issue 36 
constant luminance principle 33 
conversion considerations 32 
general 

RGB to YCbCr 19, 21 
RGB to YIQ 18 
RGB to YUV 18 
scRGB to sRGB 17 
YCbCr to RGB 19, 21 
YIQ to RGB 18 
YUV to RGB 18 
HDTV to SDTV YCbCr 194 
overflow handling 33 
PhotoYCC to RGB 27 
RGB to PhotoYCC 27 
SDTV to HDTV YCbCr 194 
color temperature 203 
color transient improvement 200 
comb filter 451 
comb filtering 451 
combination test signal 333 
common data table 816 
complementary filtering 447 
component video 
analog 

see analog component video 
digital 

see digital component video 
composite chroma keying 222 
composite test signal 330 
conditional access 
DVB 808 
OpenCable 791 

conditional access table (CAT) 672 
constant luminance problem 36 
constrained image 77, 97 
constrained parameters bitstream (CPB) 542, 
578 

content description data 605 
contrast control 198 

copy generation management system (CGMS-A) 
82, 84, 87, 94, 96, 362, 369, 372 
coring 200 
CPB 542, 578 
cross-color 446 
crossfading 204, 246 
cross-luminance 446 
CVCT 768 

D 

D frame 550 
data broadcasting 
ATSC 773 
DVB 808 
ISDB 825 
MPEG-2 727 
OpenCable 790 
data event table 773 
data partitioning 584 
data service table 774 
DCAS 791 
DCCSCT 770 
DCCT 770 
DCIF 846 



DCT 248, 816 
DV 522 
H.264 762 
MPEG-1 546, 548 
MPEG-2 587 
MPEG-4.2 741 

decode time stamp (DTS) 732 
deinterlacing 243 
bob 243 

field merging 245 
film mode 247 

fractional ratio interpolation 245 
inverse telecine 247 
motion adaptive 246 
motion-compensated 246 
scan line duplication 243 
scan line interpolation 243 
variable interpolation 245 
descriptor 

ARIB 692, 817 

audio component 817 
AVC timing and HRD 817 
AVC video 817 
basic local event 821 
board information 821 
bouquet name 821 
broadcaster name 821 
CA contract information 821 
CA EMM TS 821 
CA identifier 821 
CA service 821 

carousel compatible composite 692, 
821 

component 692, 821 
component group 821 
conditional playback 692, 821 
connected transmission 821 
content 821 

content availability 692, 821 
country availability 692, 822 
data component 693, 822 
data content 822 
digital copy control 693, 822 
download content 822 

emergency information 693, 822 

event group 822 

extended broadcaster 822 

extended event 822 

hierarchical transmission 694, 822 

hyperlink 822 

LDT linkage 822 

linkage 694, 823 

local time offset 823 

logo transmission 823 

mosaic 694, 823 

network identification 823 

network name 823 

node relation 823 

NVOD reference 823 

parental rating 694, 823 

partial reception 823 

partial transport stream 823 

partial transport stream time 823 

reference 823 

satellite delivery system 824 

series 824 

service 824 

service list 824 

short event 824 

short node information 824 

SI parameter 824 

SI prime TS 824 

STC reference 824 

stream identifier 694, 824 

stuffing 824 

system management 694, 824 
target region 694, 824 
terrestrial delivery system 824 
time-shifted event 824 
time-shifted service 825 
TS information 825 
video decode control 694, 825 



ATSC 695, 770 

AC-3 audio stream 695, 771 
ATSC CA 771 

ATSC private information 695, 771 
component name 696, 771 
content advisory 696, 771 
content identifier 771 
DCC arriving request 771 
DCC departing request 771 
enhanced signaling 697, 771 
extended channel name 771 
genre 771 

MAC address list 706 
redistribution control 697, 771 
service location 772 
SRM reference 772 
time-shifted service 772 
DVB 698, 804 
AAC 698, 804 
AC-3 698, 804 

adaptation field data 699, 804 
ancillary data 699, 804 
announcement support 804 
bouquet name 804 
CA identifier 804 
cable delivery system 804 
cell frequency link 805 
cell list 805 
component 699, 805 
content 805 

country availability 700, 805 
data broadcast 805 
data broadcast ID 700, 805 
DSNG 805 
DTS audio 700, 805 
E-AC-3 698, 804 
enhanced AC-3 698, 804 
extended event 805 
extension 701, 805 
frequency list 805 
linkage 805 
local time offset 805 
mosaic 701, 806 
multilingual bouquet name 806 
multilingual component 806 
multilingual network name 806 
multilingual service name 806 
network name 806 
NVOD 806 

parental rating 701, 806 
partial transport stream 806 
PDC 806 

private data specifier 701, 806 
satellite delivery system 806 
scrambling 701, 806 
service 806 

service availability 807 
service list 807 
service move 702, 807 
short event 807 
short smoothing buffer 807 
stream identifier 702, 807 
stuffing 807 
subtitling 702, 807 
telephone 807 
teletext 703, 807 
terrestrial delivery system 807 
time-shifted event 807 
time-shifted service 807 
transport stream 807 
VBI data 703, 807 
VBI teletext 704, 808 
ISDB 

see ARIB descriptors 
MPEG-2 675 

AAC audio (MPEG-2) 675, 698 
audio stream 676 
AVC timing and HRD 676 
AVC video 677 
CA (conditional access) 678 
caption service 678 
content labeling 684 



copyright 679 

data stream alignment 680 

DTCP 680 

DTS audio 681 

hierarchy 681 

IBP 682 

IPMP 683 

language 683 

maximum bitrate 683 

metadata 684 

metadata pointer 684 

metadata STD 684 

multiplex buffer utilization 684 

private data 685 

registration 685 

smoothing buffer 685 

STD 686 

system clock 686 

target background grid 687 

video stream 687 

video window 688 

MPEG-4 689 

external ES ID 689 
FMC 689 
fmxbuffersize 690 
IOD 690 

MPEG-4 audio 689 
MPEG-4 video 692 
multiplexbuffer 691 
muxcode 691 
SL 691 

OpenCable 704, 784 

AC-3 audio stream 704, 784, 788 
ATSC private information 784 
channel properties 788 
component name 785, 788 
component name (ATSC) 704 
component name (SCTE) 704 
content advisory 705, 785, 788 
daylight savings time 789 
DCC arriving request 785 
DCC departing request 785 
extended channel name 785, 789 
extended video 705 
frame rate 705 
MAC address list 785 
redistribution control 706, 785 
revision detection 789 
service location 785 
time-shifted service 785, 789 
two-part channel number 789 
VBI data 706 
DET 773 
DFP 168 

differential gain 419, 461 
differential luminance 419, 461 
differential phase 417, 461 
digital camera specification 189 
digital component video 37 
1080i 62 
1080p 64 

1125-line interlaced 62 
1125-line progressive 64 
4:4:4 to 4:2:2 YCbCr 195 
480i 41 
480p 45 

525-line interlaced 41 
525-line progressive 45 
576i 48 
576p 53 

625-line interlaced 48 
625-line progressive 53 
720p 56 

750-line progressive 56 
ancillary data 108 
coding ranges 37 
EAV timing 108 
F timing 108 
filtering 
CbCr 198 
Y 195 

H timing 108 
SAV timing 108 
V timing 108 
YCbCr 19 



digital composite video 129 

25 Hz offset compensation 140 
ancillary data 130 
SCH phase 129, 130 
TRS-ID 140 
video levels 130 
zero SCH phase 129, 130 
digital flatpanel interface (DFP) 168 
digital rights management (DRM) 674, 755, 835 
digital transmission content protection (DTCP) 
179 

digital visual interface (DVI) 162 
directed channel change selection code table 
770 

directed channel change table 770 
discontinuity information table 799, 816 
discrete cosine transform (DCT) 

DV 522 
H.264 762 
MPEG-1 546, 548 
MPEG-2 587 
MPEG-4.10 762 
MPEG-4.2 741 

display enhancement processing 198 
black level 198 
blue stretch 202 
brightness 198 
color 198 

color temperature correction 203 
color transient improvement 200 
contrast 198 
dynamic contrast 202 
green enhancement 202 
hue 198 

luma transient improvement 200 
saturation 198 
sharpness 200 
skin tone correction 432 
tint 198 
white level 198 
DIT 799, 816 
DLT 816 
DMIF 754 
DOCSIS 792 

download control table 816 
download table 816 

downloadable conditional access system 

(DCAS) 791 

DRM 674, 755, 835 

drop frame 338 

DST 774 

DTCP 179 

D-terminal video interface 97 

DTS 732 

DV 515 

100 Mbps 536 
50 Mbps 535 
AAUX 518 
audio 517 

audio auxiliary data 518 
compression 522 
DCT 522 

digital interface 534 

discrete cosine transform (DCT) 522 

IEC 61834 515 

IEEE 1394 535 

macroblocks 522 

SDTI 535 

SMPTE 221M 535 

SMPTE 222M 535 

SMPTE 314M 515 

SMPTE 370M 536 

super block 522 

VAUX 523 

video 521 

video auxiliary data 523 
DVB 796 

common scrambling algorithm (DVB-CSA) 
809 

conditional access 808 
CSA 809 

data broadcasting 808 
descriptors 698, 804 
DVB-IP 835 
DVB-IPI 835 



EIA-679 809 
ISO 7816 809 
multicrypt 809 
NRSS-A 809 
NRSS-B 809 
service information 798 
SI tables 798 
simulcrypt 809 
subtitles 724 
teletext 717 
video 798 

DVB common interface (DVB-CI) 809 

DVB over IP (DVB-IPI) 835 

DVB-C 798 

DVB-C2 798 

DVB-CI 809 

DVB-CI2 809 

DVB-CSA 809 

DVB-CSA2 809 

DVB-H 798 

DVB-IP 835 

DVB-IPI 835 

DVB-S 798 

DVB-S2 798 

DVB-SH 798 

DVB-T 798 

DVB-T2 798 

DVI 162 

dynamic contrast 202 
dynamic rounding 193 

E 

EBU color bars 322 
EBU N10 100 
ECM 672 
EI frame 481 
EIA color bars 317 
EIA-189-A 317 
EIA-516 374 
EIA-679 791, 809 
EIA-770.1 100 
EIA-770.2 100 
EIA-770.3 100 
EIAJ CP-4120 97 
EIAJ CPR-1204 82, 94, 372 
EIAJ RC-5237 97 
EIT 798, 816 
elementary bitstream 584 
EMM 672 
enhanced 8-vsb 772 
enhanced television programming 725 
entitlement control messages (ECM) 672 
entitlement management messages (EMM) 672 
EP frame 481 
error feedback rounding 193 
ERT 816 
ETSI EN300163 292 
ETSI EN300231 378 
ETSI EN300294 300 
ETSI EN300421 796 
ETSI EN300429 796 
ETSI EN300468 796 
ETSI EN300472 796 
ETSI EN300706 374 
ETSI EN300743 724, 796 
ETSI EN300744 796 
ETSI EN301192 796 
ETSI EN301775 712, 796 
ETSI EN302304 796 
ETSI EN302307 796 
ETSI EN 50221 797 
ETSI ES200800 797 
ETSI ETS300731 300 
ETSI ETS300732 301 
ETSI ETS300801 797 
ETSI ETS300802 797 
ETT 768, 773 
euroconnector 69 
event information table 798, 816 
event relation table 816 
E-VSB 772 
extended text table 768, 773 

F 

F timing 108 
field identification 414, 445 
NTSC 264 
PAL 286 
SECAM 306 
field prediction 642 
field square wave test signal 330 
fixed colorplus 302 
FlexMux 755 
forward prediction 546, 587 
fractional ratio interpolation 245 
frame dropping 232 
frame duplication 232 
frame prediction 642 
frame rate conversion 227 
frame dropping 232 
frame duplication 232 
motion compensation 234 
temporal interpolation 234 

G 

gamma 
HDTV 35 
NTSC 35 
PAL 36 
SECAM 36 
garbage matte 217 
genlocking 421, 436 
ghost cancellation 383 
gigabit video interface 172 
graphics overlay 204 
green enhancement 202 
GVIF 172 

H 

H tilt 421, 465 
H timing 108 
H.261 466 

block layer 476 
coding control 471 
coding layer 466 
forced updating 472 
GOB layer 473 
group of blocks layer 473 
I frame 466 
inter-frame 466 
intra-frame 466 
loop filter 469 
macroblock layer 474 
motion compensation 469 
P frame 466 
picture layer 472 
predicted frame 466 
prediction 469 
quantization 471 
video bitstream 472 
block layer 476 
GOB layer 473 
group of blocks layer 473 
macroblock layer 474 
picture layer 472 
zig-zag scan 476 
H.263 481 

additional supplemental enhancement 
information specification 512 
advanced intra-coding mode 507 
advanced prediction mode 507 
alternative inter-VLC mode 511 
B frame 481 
bidirectional frame 481 
block layer 494 
coding control 482 
coding layer 482 

continuous presence multipoint 507 
data-partitioned slice mode 511 
deblocking filter mode 508 
EI frame 481 

enhanced reference picture selection mode 
511 



EP frame 481 

forced updating 484 

forward error correction mode 507 

GOB layer 487 

group of blocks layer 487 

I frame 481 

improved PB frame 481 

improved PB frames mode 509 

independent segment decoding mode 511 

inter-frame 481 

intra-frame 481 

macroblock layer 488 

modified quantization mode 511 

motion compensation 482 

P frame 481 

PB frame 481 

PB frames mode 507 

picture layer 484 

plustype picture layer option 501 

predicted frame 481 

prediction 482 

quantization 482 

reduced resolution update mode 510 
reference picture resampling mode 510 
reference picture selection mode 509 
slice structured mode 508 
SNR scalability mode 509 
spatial scalability mode 509 
supplemental enhancement information 
508 

syntax-based arithmetic coding mode 506 
temporal scalability mode 509 
unrestricted motion vector mode 505 
video bitstream 484 
block layer 494 
GOB layer 487 
group of blocks layer 487 
macroblock layer 488 
picture layer 484 
zig-zag scan 508 
H.264 738 

SEI messages 758 

Supplemental Enhancement Information 
(SEI) messages 758 

H.264 video over MPEG-2 transport stream 674 

hanging dots 458 

Hanover bars 428 

HAVi 181 

HD-CIF 846 

HDMI 167 

HD-SDTI 144 

HDV 536 

high-data-rate serial data transport interface 
144 

high-definition multimedia interface (HDMI) 
167 
HLS 27 

house sync 406 
HSI 27 

color bars 30 
color space 27 
HSV 27 

color bars 29 
color space 27 
hue accuracy 420, 464 
hue control 198 

I 

I frame 466, 481, 543 
I pictures 585 
I slice 760 
I VOP 741 
ICIT 673 
IEC 60933 69 
IEC 61834 515 
IEC 61880 82, 372 
IEC 61883 181 
IEC 62375 87 



IEEE 1394 174 

asynchronous data 178 
bus manager nodes 176 
cycle master nodes 176 
digital camera specification 189 
digital transmission content protection 
(DTCP) 179 
DTCP 179 
DV transfers 181 
endian issues 175 
HAVi 181 
IEC 61883 181 
isochronous data 178 
isochronous nodes 176 
link layer 178 
network topology 175 
node ports 176 
node types 176 
OHCI 181 

open host controller interface (OHCI) 181 
physical layer 177 
SBP-2 181 

serial bus protocol 181 
specifications 175 
transaction nodes 176 
improved PB frame 481 
index transmission table 816 
INT 799 

intellectual property management and 
protection (IPMP) 674, 751, 755 
inter-field Y/C separation 458 
inter-frame 466, 481, 544 
interlaced-to-noninterlaced conversion 243 
inter-picture 585 
intra-field Y/C separation 451 
intra-frame 466, 481, 543 
intra-picture 585 
intra-slice 760 
inverse telecine 247 
IP video 827 

IP/MAC notification table 799 
IPMP 674, 751, 755 

IPMP control information table (ICIT) 673 
IPTV 827 

broadcast 828 
conditional access 835 
DRM 835 
multicast 828 
unicast 828 
ISDB 812 

ARIB STD-B10 812 
ARIB STD-B16 812 
ARIB STD-B20 812 
ARIB STD-B21 812 
ARIB STD-B23 812 
ARIB STD-B24 812 
ARIB STD-B25 812 
ARIB STD-B31 812 
ARIB STD-B32 812 
ARIB STD-B40 812 
ARIB STD-B5 825 
audio compression 814 
closed captioning 825 
data broadcasting 825 
data carousel 825 
data piping 825 
event message 825 
independent PES 825 
interaction channel 826 
descriptors 817 
graphics 814 

service information (SI) 816 
SI tables 816 
still pictures 814 
video compression 814 
ISDB-C 813 
ISDB-S 813 
ISDB-T 814 
ISMA 834 
ISO 7816 809 
ISO/IEC 10918 539 



ISO/IEC 11172 539 
ISO/IEC 13818 577 
ISO/IEC 14496 738 
ITT 816 

ITU multiburst test signal 328 
ITU-R BS.707 289 
ITU-R BT.1119 300, 369 
ITU-R BT.1120 114, 128 
ITU-R BT.1124 383 
ITU-R BT.1197 300 
ITU-R BT.1302 112, 128 
ITU-R BT.1303 116 
ITU-R BT.1358 39, 45, 53 
ITU-R BT.1362 129 
ITU-R BT.1381 143 
ITU-R BT.1577 144 
ITU-R BT.470 465 
ITU-R BT.471 313 
ITU-R BT.473 333 
ITU-R BT.601 37, 41, 48 
ITU-R BT.653 374 
ITU-R BT.656 112, 128 
ITU-R BT.709 39, 62, 64 
ITU-R BT.799 116 
ITU-R BT.809 378 

J 

jam sync 338 

K 

keying 211 
chroma 214 

composite chroma keying 222 
luma 222 
luminance 211 
superblack 222 

L 

LDT 816 

line bar test signal 328 
linear interpolation 224 
linear keying 214 
linked description table 816 
LIT 816 

local event information table 816 
long-term service table 773 
LTST 773 
luma keying 222 
luminance keying 211 
luminance modulation 215 
luminance nonlinearity 419, 461 

M 

macroblock 248 
DV 522 
H.264 759 
MPEG-1 546 
MPEG-2 587 
MPEG-4.10 759 
MPEG-4.2 741 
master guide table 768 
meander gate 414 
MGT 768 
MII interface 100 
mixing 204 
MJPEG 540 

modulated pedestal test signal 326 
modulated ramp test signal 325 
modulated staircase test signal 326 
motion adaptive colorplus 303 
motion adaptive deinterlacing 246 
motion adaptive Y/C separation 458 
motion compensation 545, 586, 642 
frame rate conversion 234 
motion JPEG 540 

motion-compensated deinterlacing 246 
MPEG-1 539 
audio 541 

background theory 542 
sound quality 541 
B frame 544 

backward prediction 546 
bidirectional frame 544 
bidirectional prediction 546 
block 546 
block layer 562 



closed GOP 545 

coded frame types 543 

constrained parameters bitstream 542 

CPB 542 

D frame 550 

DCT 546 

decode postprocessing 575 
decoding video 575 
encode preprocessing 543 
fast playback 575 
forward prediction 546 
group of pictures (GOP) 544 
group of pictures layer 555 
I frame 543 
inter-frame 544 
interlaced video 543 
intra-frame 543 
ISO/IEC 11172 539 
ISO/IEC 11172 layer 570 
macroblock 546 
macroblock layer 558 
motion compensation 545 
open GOP 545 
P frame 544 
pack layer 570 
packet layer 573 
pause mode 575 
picture layer 556 
postprocessing 575 
predicted frame 544 
preprocessing 543 
quality issues 540 
quantizing 547 
reverse playback 575 
sequence header 551 
slice layer 557 
system bitstream 570 

ISO/IEC 11172 layer 570 
pack layer 570 
packet layer 573 
system header 571 
system header 571 
timecode 576 
variable bit-rate 576 
video bitstream 551 
block layer 562 
group of pictures layer 555 
macroblock layer 558 
picture layer 556 
sequence header 551 
slice layer 557 
video sequence 551 
video sequence 551 
zig-zag scan 547 
MPEG-2 577 

4:2:2 profile 578 

active format description 718 

AFD 718 

audio 578 

audio/video synchronization 732 
B pictures 585 
backward prediction 587 
bidirectional pictures 585 
bidirectional prediction 587 
bit-rates 583 
block layer 622 

camera parameters extension 620 
CAT 672 

closed captioning 706 
closed GOP 585 

conditional access table (CAT) 672 
constrained parameters bitstream (CPB) 
578 

content description data 605 
copyright extension 619 
CPB 578 

data broadcasting 727 
decode time stamp (DTS) 732 
decoding video 732 
descriptors 675 

digital rights management (DRM) 674 
DRM 674 
DTS 732 
ECM 672 

elementary bitstream 584 
EMM 672 



entitlement control messages (ECM) 672 
entitlement management messages 
(EMM) 672 
field prediction 642 
forward prediction 587 
frame prediction 642 
GOP layer 603 
group of pictures (GOP) 585 
group of pictures layer 603 
high level 578 
I pictures 585 
ICIT 673 
inter-picture 585 
intra-picture 585 
IPMP 674 

IPMP control information table (ICIT) 673 
ISO/IEC 13818 577 
ITU-T ext D extension 620 
levels 578 

high 1440 level 578 
high level 578 
low level 578 
main level 578 
low level 578 
macroblock 587 
macroblock layer 621 
main level 578 
main profile 578 
motion compensation 586, 642 
field prediction 642 
frame prediction 642 
multiview profile 578 
network information table (NIT) 673 
NIT 673 
open GOP 585 
P pictures 585 
pack layer 657 

packet identification code (PID) 666 
packet layer 661 
PAT 668 
PCR 735 

picture coding extension 611 
picture display extension 616 
picture layer 604 

picture spatial scalable extension 618 
PID 666 
PMT 670 

predicted pictures 585 
presentation time stamp (PTS) 732 
profiles 

4:2:2 profile 578 
main profile 578 
multiview profile 578 
simple profile 578 
SNR profile 584 
spatial profile 584 
studio profile 578 

program association table (PAT) 668 
program clock reference (PCR) 735 
program map table (PMT) 670 
program stream 
pack layer 657 

program stream directory 661 
program stream map 659 
system header 657 
program stream directory 661 
program stream map 659 
PTS 732 

quant matrix extension 614 
quantizing 589 
scalability 584 

data partitioning 584 
SNR scalability 584 
spatial scalability 584 
temporal scalability 584 
sequence display extension 598 
sequence extension 596 
sequence header 593 
sequence scalable extension 601 
simple profile 578 
slice layer 620 
SNR profile 584 
spatial profile 584 
studio profile 578 
subtitles 720 
system header 657 



teletext 717 
timestamps 732 
transport stream 661 
H.264 video 674 
MPEG-4.10 video 674 
MPEG-4.2 video 674 
packet layer 661 
SMPTE 421M video 675 
VC-1 video 675 

transport stream description table (TSDT) 
671 

TSDT 671 
user data 596 
video bitstream 591 
block layer 622 

camera parameters extension 620 
copyright extension 619 
GOP layer 603 
group of pictures layer 603 
ITU-T ext D extension 620 
macroblock layer 621 
picture coding extension 611 
picture display extension 616 
picture layer 604 

picture spatial scalable extension 618 
quant matrix extension 614 
sequence display extension 598 
sequence extension 596 
sequence header 593 
sequence scalable extension 601 
slice layer 620 
user data 596 
video sequence 593 
video sequence 593 
zig-zag scan 589 
MPEG-2.5 578 
MPEG-4 738, 747 

audio compression 739 
B slice 760 

bidirectional slice 760 
BIFS 751 
descriptors 689 
DMIF 754 
DRM 755 
FlexMux 755 
GOV 749 
graphics 747 

group of video object plane (GOV) 749 

I slice 760 

intra-slice 760 

IPMP 751, 755 

ISO/IEC 14496 738 

object description framework 749 

P slice 760 

predicted slice 760 

scene description 751 

SI slice 760 

SL-packetized stream 753 

SP slice 760 

sync layer 753 

video compression 741 

video object 747 

video object layer 747 

video object plane 749 

visual object plane (VOP) 741, 749 

visual object sequence 747 

visual profiles (natural) 743 

VOP 741, 749 

MPEG-4.10 video over MPEG-2 transport 
stream 674 

MPEG-4.2 video over MPEG-2 transport 
stream 674 

multiburst test signal 328 
multicast 828 
multicrypt 809 
multipulse test signal 328 

N 

NABTS 374 
NBIT 817 

network board information table 817 
network information table 673, 799, 817 
network resources table 774 
NICAM 728 289 
NIT 673, 799, 817 
noncomplementary filtering 447 



noninterlaced 
NTSC 266 
PAL 289 

noninterlaced-to-interlaced conversion 241 
NRSS-A 809 
NRSS-B 791, 809 
NRT 774 

NTC-7 combination test signal 333 
NTC-7 composite video test signal 330 
NTSC 257 

4-field sequence 264 
channel assignments 268 
closed captioning 346 
formats 265 
noninterlaced 266 
overview 257 
RF modulation 265 
teletext 374 
timecode 337 
VBI data 337 
widescreen signaling 369 
WSS 

see widescreen signaling 
NTSC decoding 422 

10-step staircase test signal 325 

12.5T pulse 336 

25T pulse 336 

2D comb filter 451 

2T pulse 336 

3D comb filter 458 

adaptive comb filter 457 

alpha 458 

alpha channel 458 

auto detect 446 

automatic gain control 424 

BT.470 465 

chrominance demodulation 425 
chrominance nonlinear gain 464 
chrominance nonlinear phase 464 
chrominance-luminance intermodulation 
464 

color bars test signal 312 
color burst detection 433 
color saturation accuracy 465 
comb filter 451 
combination test signal 333 
complementary filtering 447 
composite test signal 330 
composite video digitizing 422 
cross-color 446 
cross-luminance 446 
deinterlacing 243 
differential gain 461 
differential phase 461 
EIA color bars 317 
field identification 445 
field signal 445 

field square wave test signal 330 
filtering 428 
gamma 35 
genlocking 436 
H tilt 465 

horizontal blanking 444 

horizontal sync 444 

hue accuracy 464 

hue adjustment 432 

inter-field comb filter 458 

interlaced-to-noninterlaced conversion 243 

intra-field comb filter 451 

ITU-R BT.470 465 

line bar test signal 328 

luminance nonlinearity 461 

modulated pedestal test signal 326 

modulated ramp test signal 325 

modulated staircase test signal 326 

motion adaptive Y/C separation 458 

multiburst test signal 328 

multipulse test signal 328 

NTC-7 combination test signal 333 

NTC-7 composite test signal 330 

PLUGE test signal 323 

progressive scan conversion 243 

red field test signal 325 

reverse blue bars test signal 322 

SMPTE bars test signal 322 

subcarrier generation 441 



subcarrier locking 441 
S-video connector 69 
T pulse 336 
V tilt 465 

vertical blanking 445 
vertical sync 444 
video parameters 461 

chrominance nonlinear gain 464 
chrominance nonlinear phase 464 
chrominance-luminance intermodulation 
464 

color saturation accuracy 465 
differential gain 461 
differential luminance 461 
differential phase 461 
H tilt 465 
hue accuracy 464 
luminance nonlinearity 461 
V tilt 465 

video test signals 312 
10-step staircase 325 
12.5T pulse 336 
25T pulse 336 
2T pulse 336 
color bars 312 
combination 333 
composite 330 
EIA color bars 317 
field square wave 330 
line bar 328 

modulated pedestal 326 
modulated ramp 325 
modulated staircase 326 
multiburst 328 
multipulse 328 
NTC-7 composite 330 
PLUGE 323 
red field 325 
reverse blue bars 322 
SMPTE bars 322 
T pulse 336 
Y bars 324 
video timing generation 444 
Y bars test signal 324 
Y/C connector 69 
Y/C separation 446 
2D comb filter 451 
3D comb filter 458 
adaptive comb filter 457 
comb filter 451 
complementary filtering 447 
inter-field comb filter 458 
intra-field comb filter 451 
motion adaptive 458 
noncomplementary filtering 447 
simple 447 
NTSC encoding 389 

10-step staircase test signal 325 

12.5T pulse 336 

25T pulse 336 

2T pulse 336 

alpha 422 

alpha channel 422 

bandwidth-limited edge generation 416 
black burst 406 
BT.470 465 
burst generation 402 
chrominance frequency spectra 398 
chrominance nonlinear gain 419 
chrominance nonlinear phase 419 
chrominance-luminance intermodulation 
419 

clean encoding 415 
color bars test signal 312 
color saturation accuracy 420 
combination test signal 333 
composite test signal 330 
composite video generation 404 
differential gain 419 
differential luminance 419 
differential phase 417 
EIA chrominance color bars 403 
EIA color bars 317 
field identification 414 
field square wave test signal 330 



filtering 396 
gamma 35 
genlocking 421 
H tilt 421 

horizontal timing 411 

house sync 406 

hue accuracy 420 

ITU-R BT.470 465 

line bar test signal 328 

luminance generation 393 

luminance nonlinearity 419 

modulated pedestal test signal 326 

modulated ramp test signal 325 

modulated staircase test signal 326 

multiburst test signal 328 

multipulse test signal 328 

noninterlaced-to-interlaced conversion 241 

NTC-7 combination test signal 333 

NTC-7 composite test signal 330 

PLUGE test signal 323 

red field test signal 325 

residual subcarrier 420 

reverse blue bars test signal 322 

SCH phase 420 

SMPTE bars test signal 322 

subcarrier generation 407 

S-video connector 69 

S-video output skew 421 

T pulse 336 

V tilt 421 

vertical timing 411 
video levels 

chrominance (C) 403 
composite 404 
luminance (Y) 395 
video parameters 417 

chrominance nonlinear gain 419 
chrominance nonlinear phase 419 
chrominance-luminance intermodulation 
419 

color saturation accuracy 420 
differential gain 419 
differential luminance 419 
differential phase 417
H tilt 421 
hue accuracy 420 
luminance nonlinearity 419 
residual subcarrier 420 
SCH phase 420 
V tilt 421

Y/C output skew 421 
video test signals 312 
10-step staircase 325 
12.5T pulse 336
25T pulse 336 
2T pulse 336 
color bars 312 
combination 333 
composite 330 
EIA color bars 317 
field square wave 330 
line bar 328 

modulated pedestal 326 
modulated ramp 325 
modulated staircase 326 
multiburst 328 
multipulse 328 
NTC-7 composite 330 
PLUGE 323 
red field 325 
reverse blue bars 322 
SMPTE bars 322 
T pulse 336 
Y bars 324

Y bars test signal 324 
Y/C connector 69 
Y/C output skew 421 

O 

object description framework 749 

OHCI 181

open GOP 545, 585

open host controller interface (OHCI) 181

OpenCable 778
audio 780

closed captioning 710
conditional access 791
data broadcasting 790
DCAS 791
descriptors 704, 784
DOCSIS 792
EIA-679 791
NRSS-B 791
PacketCable 792
service information 780
SI tables 780
video 780
openLDI 170
openLVDS 170
oversampled VBI data 381

P

P frame 466, 481, 544
P pictures 585
P slice 760
P VOP 741

packet identification code (PID) 666
PacketCable 792
PAL 280

channel assignments 295
closed captioning 368
formats 290
NICAM 728 289
noninterlaced 289
overview 280
RF modulation 285
teletext 374
timecode 337
VBI data 337
widescreen signaling 369
WSS

see widescreen signaling
PAL decoding 422

10-step staircase test signal 325 

10T pulse 336 

20T pulse 336 

2D comb filter 451 

2T pulse 336 

3D comb filter 458 

adaptive comb filter 457 

alpha 458 

alpha channel 458 

auto detect 446 

automatic gain control 424 

BT.470 465 

chrominance demodulation 425 
chrominance nonlinear gain 464 
chrominance nonlinear phase 464 
chrominance-luminance intermodulation 464

color bars test signal 312 
color burst detection 433 
color saturation accuracy 465 
comb filter 451 
complementary filtering 447 
composite video digitizing 422 
cross-color 446 
cross-luminance 446 
deinterlacing 243 
differential gain 461 
differential phase 461 
EBU color bars 322 
euroconnector 69 
field identification 446 
field signal 445 

field square wave test signal 330 

filtering 428 

gamma 36 

genlocking 436 

H tilt 465 

Hanover bars 428 

horizontal blanking 444 

horizontal sync 444 

hue accuracy 464 



hue adjustment 432 

inter-field comb filter 458 

interlaced-to-noninterlaced conversion 243 

intra-field comb filter 451 

ITU multiburst test signal 328 

ITU-R BT.470 465 

line bar test signal 328 

luminance nonlinearity 461 

modulated pedestal test signal 326 

modulated ramp test signal 325 

modulated staircase test signal 326 

motion adaptive Y/C separation 458 

multiburst test signal 328 

multipulse test signal 328 

PAL delay line 449 

PAL modifier 449 

PAL switch 444 

peritel connector 69 

peritelevision connector 69 

PLUGE test signal 323 

progressive scan conversion 243 

red field test signal 325 

reverse blue bars test signal 322 

SCART connector 69 

simple PAL decoder 449 

subcarrier generation 441 

subcarrier locking 441 

S-video connector 69 

T pulse 336 

V tilt 465 

vertical blanking 445 
vertical sync 444 
video parameters 461 

chrominance nonlinear gain 464 
chrominance nonlinear phase 464 
chrominance-luminance intermodulation 464

color saturation accuracy 465 
differential gain 461 
differential luminance 461 
differential phase 461 
H tilt 465 
hue accuracy 464
luminance nonlinearity 461 
V tilt 465 

video test signals 312 
10-step staircase 325 
10T pulse 336 
20T pulse 336 
2T pulse 336 
color bars 312 
combination 333 
composite 330 
EBU color bars 322 
field square wave 330 
ITU multiburst 328 
line bar 328 

modulated pedestal 326 
modulated ramp 325 
modulated staircase 326 
multiburst 328 
multipulse 328 
PLUGE 323 
red field 325 
reverse blue bars 322 
T pulse 336 
Y bars 324

video timing generation 444 

Y bars test signal 324 

Y/C connector 69 

Y/C separation 446 
2D comb filter 451 
3D comb filter 458 
adaptive comb filter 457 
comb filter 451 
complementary filtering 447 
inter-field comb filter 458 
intra-field comb filter 451 
motion adaptive 458 
noncomplementary filtering 447 
PAL delay line 449 
PAL modifier 449 
simple 447 
PAL delay line 449 



PAL encoding 389 

10-step staircase test signal 325 

10T pulse 336 

20T pulse 336 

2T pulse 336 

alpha 422 

alpha channel 422 

bandwidth-limited edge generation 416 
black burst 406 
Bruch blanking 414 
BT.470 465 
burst generation 402 
chrominance frequency spectra 399 
chrominance nonlinear gain 419 
chrominance nonlinear phase 419 
chrominance-luminance intermodulation 419

clean encoding 415 
color bars test signal 312 
color saturation accuracy 420 
composite video generation 404 
differential gain 419 
differential luminance 419 
differential phase 417 
EBU chrominance color bars 403 
EBU color bars 322 
euroconnector 69 
field identification 414 
field square wave test signal 330 
filtering 396 
gamma 36 
genlocking 421 
H tilt 421 

horizontal timing 411 
house sync 406 
hue accuracy 420 
ITU multiburst test signal 328 
ITU-R BT.470 465 
line bar test signal 328 
luminance generation 393 
luminance nonlinearity 419 
meander gate 414 
modulated pedestal test signal 326

modulated ramp test signal 325 

modulated staircase test signal 326 

multiburst test signal 328 

multipulse test signal 328 

noninterlaced-to-interlaced conversion 241 

PAL switch 407 

peritel connector 69 

peritelevision connector 69 

PLUGE test signal 323 

red field test signal 325 

residual subcarrier 420 

reverse blue bars test signal 322 

SCART connector 69 

SCH phase 420 

subcarrier generation 407 

S-video connector 69 

S-video output skew 421 

T pulse 336 

V tilt 421 

vertical timing 411 
video levels 

chrominance (C) 403 
composite 405 
luminance (Y) 395 
video parameters 417 

chrominance nonlinear gain 419 
chrominance nonlinear phase 419 
chrominance-luminance intermodulation 419

color saturation accuracy 420 
differential gain 419 
differential luminance 419 
differential phase 417 
H tilt 421 
hue accuracy 420 
luminance nonlinearity 419 
residual subcarrier 420 
SCH phase 420 
V tilt 421 

Y/C output skew 421 



video test signals 312 
10-step staircase 325 
10T pulse 336 
20T pulse 336 
2T pulse 336 
color bars 312 
combination 333 
composite 330 
EBU color bars 322 
field square wave 330 
ITU multiburst 328 
line bar 328 

modulated pedestal 326 
modulated ramp 325 
modulated staircase 326 
multiburst 328 
multipulse 328 
PLUGE 323 
red field 325 
reverse blue bars 322 
T pulse 336 
Y bars 324

Y bars test signal 324 
Y/C connector 69
Y/C output skew 421
PAL modifier 449 
PAL switch 407, 444 
PALplus 300 

partial content announcement table 817 
PAT 668 
PB frame 481 
improved 481 
PCAT 817
PCR 735 
PDC 378 
peaking filter 200 
peritel connector 69 
peritelevision connector 69 
picture control 198 
PID 666 

PLUGE test signal 323
PMT 670

predicted frame 466, 481, 544

predicted pictures 585

predicted slice 760

presentation time stamp (PTS) 732

program and system information protocol 768, 772

program association table (PAT) 668
program clock reference (PCR) 735
program delivery control (PDC) 378
program map table (PMT) 670
program stream 656
progressive DCT 540
progressive scan conversion 243
PSIP 768, 772
PSIP-E 772
PTS 732

Q

QSIF 48

square pixel 48
quantization 248
quantizing 248
H.261 471
H.263 482
MPEG-1 547
MPEG-2 589

R

raw VBI data 381

real-time control protocol (RTCP) 833
real-time streaming protocol (RTSP) 828
real-time transport protocol (RTP) 830
red field test signal 325
region rating table 768
residual subcarrier 420
resource reservation protocol (RSVP) 834
reverse blue bars test signal 322
RF modulation
NTSC 265
PAL 285
RGB

color bars 16
color space 15
RGB interface

HDTV 75
digitization 77
generation 75
SDTV 71

0 IRE blanking pedestal 74
7.5 IRE blanking pedestal 71
digitization 74, 75
generation 71, 74
VGA 100
rounding 193 

conventional 193 
dynamic 193 
error feedback 193 
truncation 193 
RRT 768
RST 804, 817 
RSVP 834 
RTCP 833 
RTP 830 
RTSP 828 

run length coding 250 
running status table 804, 817 

S 

S VOP 741

saturation control 198 
SBP-2 181 
scalability 

data partitioning 584 
SNR scalability 584 
spatial scalability 584 
temporal scalability 584 
scaling 223 

anti-aliased resampling 224 
bilinear interpolation 224 
Bresenham algorithm 224 
linear interpolation 224 
pixel dropping 224
pixel duplication 224 
scan rate conversion 227 
frame dropping 232 
frame duplication 232 
motion compensation 234 
temporal interpolation 234 
SCART connector 69 
scene description 751 
SCH phase 129, 130, 420 
scRGB 17, 26 
SCTE 07 778 
SCTE 18 778 
SCTE 20 710, 778 
SCTE 21 710 
SCTE 26 778 
SCTE 27 720 
SCTE 40 778 
SCTE 42 727 
SCTE 43 778 
SCTE 54 778 
SCTE 55 778 
SCTE 65 778 
SCTE 80 778 
SDI 128 
SDT 799, 817 
SDTI 143 
DV 535
SDTT 817
SECAM 303 

4-field sequence 306 
formats 307 
gamma 36 
overview 303 
SEI messages 758 
selection information table 804, 817 
sequential DCT 539 
serial bus protocol 181 
serial data transport interface 143 
service description table 799, 817 



service information 
ATSC 768 
DVB 798 
ISDB 816 
OpenCable 780 
shadow chroma keying 215 
sharpness control 200 
SI slice 760 
SI table 

ATSC 768 

cable virtual channel table (CVCT) 768 
CVCT 768

data event table (DET) 773 
data service table (DST) 774 
DCCSCT 770 
DCCT 770
DET 773 

directed channel change selection code table (DCCSCT) 770
directed channel change table (DCCT) 770

DST 774 
EIT 768 
ETT 768, 773 

event information table (EIT) 768 
E-VSB 

PSIP-E 772 

extended text table (ETT) 768, 773 
long-term service table (LTST) 773 
LTST 773

master guide table (MGT) 768 
MGT 768 

network resources table (NRT) 774 

NRT 774

PSIP-E 772 

rating region table (RRT) 768 

RRT 768

STT 768

system time table (STT) 768 
terrestrial virtual channel table (TVCT) 768
TVCT 768

VCT 768

virtual channel table (VCT) 768 
DVB 798 
BAT 799 

bouquet association table (BAT) 799 
discontinuity information table (DIT) 799
DIT 799 
EIT 798 

event information table (EIT) 798 
network information table (NIT) 799 
NIT 799 
RST 804

running status table (RST) 804 
SDT 799 

selection information table (SIT) 804 
service description table (SDT) 799 
SIT 804 
ST 804 

stuffing table 804 
TDT 799 

time and date table (TDT) 799 
time offset table (TOT) 804 
TOT 804 
ISDB 816 
AIT 816 

application information table (AIT) 816 
BAT 816 
BIT 816 

bouquet association table (BAT) 816 
broadcaster information table (BIT) 816

CDT 816

common data table (CDT) 816
DCT 816

discontinuity information table (DIT) 816
DIT 816
DLT 816

download control table (DCT) 816 
download table (DLT) 816 



EIT 816 
ERT 816 

event information table (EIT) 816 
event relation table (ERT) 816 
index transmission table (ITT) 816 
ITT 816 
LDT 816 

linked description table (LDT) 816 
LIT 816 

local event information table (LIT) 816 
NBIT 817 

network board information table (NBIT) 817

network information table (NIT) 817 
NIT 817 

partial content announcement table (PCAT) 817
PCAT 817
RST 817 

running status table (RST) 817 

SDT 817 

SDTT 817

selection information table (SIT) 817 
service description table (SDT) 817 
SIT 817 

software download trigger table (SDTT) 817
ST 817 

stuffing table (ST) 817 
TDT 817 

time and date table (TDT) 817 
time offset table (TOT) 817 
TOT 817 
OpenCable 
ADET 791
AEIT 786
AETT 786

aggregate data event table (ADET) 791
aggregate event information table (AEIT) 786

aggregate extended text table (AETT) 786
cable virtual channel table (CVCT) 781
CVCT 781

data event table (DET) 790 
data service table (DST) 791 
DCCSCT 782 
DCCT 782
DET 790 

directed channel change selection code table (DCCSCT) 782
directed channel change table (DCCT) 782
DST 791 
EA 782, 786 
EIT 782 

emergency alert (EA) table 782, 786 
ETT 782, 790 

event information table (EIT) 782 
extended text table (ETT) 782, 790 
long-form virtual channel table 786 
long-term service table (LTST) 790 
LTST 790

master guide table (MGT) 782, 786 
MGT 782, 786 

network information table (NIT) 786 
network resources table (NRT) 791 
network text table (NTT) 786 
NIT 786 
NRT 791 
NTT 786 

rating region table (RRT) 782, 786 
RRT 782, 786 

short-form virtual channel table 786 
STT 782, 788

system time table (STT) 782, 788 

SIF 48 

square pixel 48 
simulcrypt 809 
sine-squared pulse 336, 416 
SIT 804, 817 
skin tone correction 432 
sliced VBI data 381 
SL-packetized stream 753 
SMPTE 125M 116, 118 



SMPTE 12M 337 
SMPTE 170M 386 
SMPTE 221M 535 
SMPTE 222M 535 
SMPTE 244M 136 
SMPTE 259M 128 
SMPTE 262M 345 
SMPTE 267M 41, 112, 116, 118 
SMPTE 274M 62, 64, 114
SMPTE 292M 128 
SMPTE 293M 45 
SMPTE 296M 56 
SMPTE 305M 143 
SMPTE 309M 345 
SMPTE 314M 515 
SMPTE 344M 128 
SMPTE 348M 144 
SMPTE 370M 536 

SMPTE 421M video over MPEG-2 transport stream 675

SMPTE bars test signal 322 

SNR scalability 584 

software download trigger table 817 

SP slice 760 

spatial scalability 584 

spectrum locus 28 

sRGB 16 

ST 804, 817 

STT 768 

studio RGB 19, 20, 21 
stuffing table 804, 817 
subcarrier generation 407, 441 
subcarrier locking 441 
subtitles 

digital cable 720 
DVB 724 
ISDB 825 
MPEG-2 720 
super block 522 
superblack keying 222 

Supplemental Enhancement Information (SEI) messages 758
S-video 394, 402
S-video interface 68
extended 69 
S-video output skew 421 
sync layer 753 
system bitstream 570 
system time table 768 

T 

T pulse 336, 416 
T step 416 
TDT 799, 817 
teletext 374 

DVB 712, 717 
MPEG-2 717 

temporal interpolation 234 
temporal rate conversion 227 
temporal scalability 584 
terrestrial virtual channel table 768 
text overlay 204 

three-level chrominance bar test signal 326 

time and date table 799, 817 

time offset table 804, 817 

time stamps 732 

timecode 337 

drop frame 338 
jam sync 338 
longitudinal timecode 338 
LTC 338 
user data 346 

vertical interval timecode 341 
VITC 341 
tint control 198 
TOT 804, 817 
TP_extra_header 662 
transport interfaces 143 
BT.1381 143 
BT.1577 144 
HD-SDTI 144 

high-data-rate serial data transport interface 144
IEEE 1394 174 
ITU-R BT.1381 143 
ITU-R BT.1577 144 



SDTI 143 

serial data transport interface 143 
SMPTE 305M 143 
SMPTE 348M 144 
transport stream 661 

transport stream description table (TSDT) 671 
tristimulus values 31 
TSDT 671 
TVCT 768

U 

unicast 828 
user controls 198 

black level adjustment 198 
brightness adjustment 198 
color adjustment 198 
contrast adjustment 198 
hue adjustment 198 
picture adjustment 198 
saturation adjustment 198 
sharpness adjustment 200 
tint adjustment 198 
white level adjustment 198 

V 

V tilt 421, 465 

V timing 108 

variable interpolation 245 
varispeed 241 
VBI data 337 

AMOL 381, 703, 844

ancillary data 108 

CGMS 82, 84, 87, 94, 96, 362, 369 

closed captioning 346 

ghost cancellation 383 

NABTS 374 

oversampled 381 

raw 381 

sliced 381 

teletext 374 

timecode 337 

vertical interval timecode 341 
VITC 341 
widescreen signaling 87, 300, 369, 372, 703, 712
WSS 

see widescreen signaling 
VC-1 video over MPEG-2 transport stream 675 
V-chip 359, 768, 782, 786 
VCT 768

vertical interval timecode 341 
VGA connector 100 
video bitstream 
H.261 472 
H.263 484 
MPEG-1 551 
MPEG-2 591 
video conferencing 466 
video description 266 
video interface port 158 
video interface, analog 68 
Betacam 100 
D-connector 97 
EBU N10 100 
euroconnector 69 
MII 100
peritel 69 
peritelevision 69 
RGB HDTV 75 
RGB SDTV 71 
SCART 69 
SMPTE 100 
S-video 68 
VGA 100 
YPbPr HDTV 90 
YPbPr SDTV 77 

video interface, digital, consumer 162, 174 
DFP 168 

digital flat panel interface 168 
digital visual interface (DVI) 162
DVI 162 

gigabit video interface 172 
GVIF 172 
HDMI 167 

high-definition multimedia interface (HDMI) 167



open LVDS display interface 170 
openLDI 170 

video interface, digital, IC 149 
BT.656 156 

video interface port 158 
video module interface 154 
VIP 158 
VMI 154 

video interface, digital, pro-video 106 
BT.1120 114, 128 
BT.1302 112, 128 
BT.1303 116 
BT.1362 129 
BT.656 112, 128 
BT.799 116 
SDI 128 

SMPTE 125M 112 
SMPTE 259M 128 
SMPTE 267M 112 
SMPTE 274M 114 
SMPTE 292M 128 
SMPTE 294M 129 
SMPTE 344M 128 
video levels 
NTSC 

chrominance (C) 403 
composite 404 
luminance (Y) 395 
PAL 

chrominance (C) 403 
composite 405 
luminance (Y) 395 
RGB 71 
YPbPr 77 
video mixing 204 
video module interface 154 
video over IP 827 
ARIB 835 
broadcast 828 
conditional access 835 
digital rights management (DRM) 835 
DRM 835 
DVB-IPI 835
multicast 828
unicast 828 

video parameters 417, 461 

chrominance nonlinear gain 419, 464 
chrominance nonlinear phase 419, 464 
chrominance-to-luminance intermodulation 419

color saturation accuracy 420, 465 

differential gain 419, 461 

differential luminance 419, 461 

differential phase 417, 461 

H tilt 421, 465 

hue accuracy 420, 464 

luminance nonlinearity 419, 461 

residual subcarrier 420 

SCH phase 420 

V tilt 421, 465 

Y/C output skew 421 

video processing 192 
alpha mixing 204 
black level control 198 
blue stretch 202 
brightness control 198 
chroma keying 214 
color control 198 
color transient improvement 200 
composite chroma keying 222 
contrast control 198 
coring 200 
deinterlacing 243 
dynamic contrast 202 
frame rate conversion 227 
graphics overlay 204 
green enhancement 202 
hue control 198 

interlaced-to-noninterlaced conversion 243 
keying 211 

luma transient improvement 200 
luminance keying 211 
noninterlaced-to-interlaced conversion 241 
peaking filter 200 
picture control 198 
rounding 193 



saturation control 198 
scaling 223 

anti-aliased resampling 224 
bilinear interpolation 224 
Bresenham algorithm 224 
linear interpolation 224 
pixel dropping 224 
pixel duplication 224 
scan rate conversion 227 
sharpness 200 
skin tone correction 432 
text overlay 204 
tint control 198 
video mixing 204 
white level control 198 
video programming system (VPS) 378 
video test signals 312 
color bars 
NTSC 316 
PAL 319 
NTSC/PAL 

10-step staircase 325 
10T pulse 336 
12.5T pulse 336
20T pulse 336 
25T pulse 336 
2T pulse 336 
color bars 312 
combination 333 
composite 330 
EBU color bars 322 
EIA color bars 317 
field square wave 330 
ITU multiburst 328 
line bar 328 

modulated pedestal 326 
modulated ramp 325 
modulated staircase 326 
multiburst 328 
multipulse 328 
NTC-7 combination 333 
NTC-7 composite 330 
PLUGE 323 
red field 325
reverse blue bars 322 
sine-squared pulse 336 
SMPTE bars 322 
T pulse 336 

three-level chrominance bar 326 
Y bars 324 

VIP 158 

virtual channel table 768 
visual scene 747 
VITC 341 
VMI 154 
VPS 378 

W 

white color 203 
white level control 198 
white stretch 202 
wide keying 221 
widescreen signaling 87, 369 
EIAJ CPR-1204 372 
ETSI EN 300 294 369
IEC 61880 372
ITU-R BT.1119 369
PALplus 300 
WSS 

see widescreen signaling 

X 

xvYCC color space 26 



Y 

Y bars test signal 324 
Y/C output skew 421
Y/C separation 446

2D comb filter 451 
3D comb filter 458 
adaptive comb filter 457 
comb filter 451 
complementary filtering 447 
inter-field 458 
intra-field 451 
motion adaptive 458 
noncomplementary filtering 447 
PAL delay line 449 
PAL modifier 449 
simple 447 
Y/C video 394, 402 
YCbCr 19 

4:1:1 format 22 
4:2:0 format 22 
4:2:2 format 22 
4:4:4 format 21 
4:4:4 to 4:2:2 195 
color space 19 
YIQ 18 

YPbPr analog video 77, 90 
YPbPr interface 
Betacam 100 
D-connector 97 
EBU N10 100 
HDTV 90 
MII 100
SDTV 77
SMPTE 100
YUV 17
YUV12 22 
Z

zero SCH phase 129, 130 
zig-zag scan 250 
H.261 250, 476 
H.263 250, 508 
MPEG-1 250, 547 
MPEG-2 250, 589 
MPEG-4.2 250 
Zweiton 289 




